HelpSpot Gets PostgreSQL Support
If you're in the market for a powerful and user friendly Help Desk solution, please take a look at my company's flagship product HelpSpot.Well what can I say, I'm a sucker for customer feedback. Actually, PostgreSQL support has been in my plans from the beginning, but I always envisioned it as more of a V2 feature. This was probably stupid on my part as I'm pretty aware of the database "camps" out there, but I was hoping to get by. However, after several emails from potential customers asking about alternative database support (thanks Bob. Tim, and Mike!) I decided it was better to do the porting now and get it over with. I wouldn't feel comfortable adding support for a new database without a full beta cycle on it and that won't come again until V2 so it seemed best to do it now, hence the reason I've been very silent here the past few days.
First off, this would not even have been possible if I wasn't already using the super fantastic ADODB abstraction library. It's just amazing how much functionality is in that package. However, there were still a few bumps, which I'll point out here since there's not much out there about this stuff. At least, I found very few weblog entries or articles which discussed all of the issues someone might face when porting a MySQL codebase to PostgreSQL. Hopefully this helps someone in the future.
1. Automatic case folding
Ah, this was a big one. HelpSpot's database was written in my own offshoot of Hungarian Notation, meaning the names of tables and fields are camel case with a leading indicator as to the data type of the field. Not a big deal I thought. Having developed for years on MySQL and MS SQL Server this always worked fine. Ah my friends not in the PostgreSQL world.
See the issue is that I always write my queries something like this:
SELECT xProduct, sThing, sOtherThing
FROM HS_ThingTable
WHERE HS_ThingTable.xBlah = 8
No problem under normal conditions. However in PostgreSQL all tables and fields are case folded to lowercase automatically unless surrounded by quotes! So the above query becomes:
SELECT xproduct, sthing, sotherthing
FROM hs_thingtable
WHERE hs_thingtable.xblah = 8
So PostgreSQL ends up looking for table hs_thingtable when it needs to be looking for HS_ThingTable hence it can't find it and the query fails. To force PostgreSQL to obey you need to make the query look like this:
SELECT "xProduct", "sThing"," sOtherThing"
FROM "HS_ThingTable"
WHERE "HS_ThingTable"."xBlah" = 8
Not a big deal you say, just put the quotes in. Ah, that works great for PostgreSQL and MS SQL Server since the quotes are part of the SQL standard, but those crazy guys over at MySQL decided that they would rather use `back ticks` instead so when you try the query with quotes it fails in MySQL :-(
So the other alternative is to make everything lowercase, but I refused to do that because I find it to be much less readable. Also it would be a truly huge amount of work, because internally I reference the recordsets by associative array so there's things like this all over the place: $thingy['sThing'] I'm not about to go and change tens of thousands of references a few weeks before the beta.
Luckily ADODB to the rescue! Using the very cool fnExecute variable I was able to define a function which runs only if PostgreSQL is the current databse type. In that case I do some lightweight query rewriting to add in the necessary quotes. It works perfect and after some quick checks to make sure performance wasn't hurt I was satisfied with the solution and able to both keep my camel case and continue the port.
On a side note, MySQL since sometime in V4 does have an option to set the quote character back to ANSI standard, but since I want to support pre 4 databases that wasn't an option. Also, I believe it's something that must be configured in MySQL which also wasn't acceptable to me, since it's sure to be the type of thing which causes many people headaches during installation.
aside 1: PostgreSQL documentation is terrible, I mean really bad on their site. The search is a total joke. Hence I wouldn't be surprised if someone posts here that all I had to do was X and PostgreSQL would honor the camel case. If so I'd love to know the trick.
aside 2: Damn I love having transactions!!! That MySQL gets away without them is amazing, it actually costs me a alot of time that the MyISAM table type doesn't support them because I have to do more checks to try and maintain at least some consistency in case of a failure between inserts/updates/deletes. Yes I know about the InnoDB type, no I won't use it. To much overhead for the customer to know about plus the loss of fulltext searching. This is another nice thing about ADODB. I can put all the ADODB transaction code in there and it works for supported databases and is ignored in DB's that don't have transcations like MySQL.
2. Last Insert ID
You insert into one table, get the primary key ID of that inserted row and use that in another query on a related table. Happens all the time. A little bit of a pain on PostgreSQL. The good news is that ADODB has a undocumented Postgres function called pg_insert_id which handles it, the only thing is you have to pass it the table name and field name so it can look at the sequence ID and return that value so you'll need to change all your Insert_ID references to the other function when using PostgreSQL.
3. Fulltext search
Full text search won't be supported in HelpSpot V1. Unlike the other platforms the fulltext engine is an addon and must be built. As such, it functions very differently than the internal fulltext support of MySQL and SQL Server. I gave it a quick go, but could see that it was going to be a challenge so this will have to wait. I also didn't really like that the open source project that built the engine seems to have had it's last update in 2003. Perhaps the Postgres people are working on a built in solution????
4. Empty quotes for numeric field
This was the last big hurdle. In several spots of code the INSERT SQL generated for numeric fields simply passed an empty string. MySQL handles this by inserting the tables default value for that field, PostgreSQL throws an error. So I went through and fixed that. It's actually much better this way from a security standpoint anyway and I really should have been doing this all along in every spot.
Whew, so overall it really wasn't too bad. About a day or so of work and I think the changes I made actually helped HelpSpot become a better product. I now also have great coverage by being able to support 3 great databases at least one of which should be acceptable in most IT shops.
Discussion
Why not use underscores in place of spaces? I like to name fields according to what table they are in usually.. A table named "clients" might have fields c_name_first, c_name_last. It has never mattered to me to much to name a field so that the type is easilly recognizable because my naming conventions would give it away anyways.. c_last_login_date (obviously datetime), c_something_count (obvious int) and same is true for any other types.
I like naming it this way mostly because in complex JOINS I would be able to clearly tell that c_added_date is the date the client is added and co_added_date. I've never used those exact fields but can't think of a better example right now.
It might not be the best naming conventions but it sure beats AS400 with names like CLI1345, CLI1346, CL1347B.. Lol, I had the pleasure of making an app that pulled from some legacy databases and believe me if I could go back and time and smack IBM for the horror that is AS400 I would. Speaking of ADODB, In that same app I loved how easilly I could pull from so many different databases. I pulled form 2 versions of Oracle, MySQL, MSSQL, and AS400 all with ADODB. Although some extra stuff did need to be installed to read from AS400, the rest was a simple connection string change.
Is MySQL really case specific? I have used MySQL before but I never noticed since I used lowercase all the time anyways. I don't believe MSSQL cares about it but can't say I ever checked that either.
Created by Collin on 08.26.2005 10:08 pm
I could have gone underscores but I'd still have the same practical problem now, which is that I didn't do it in the beginning so I would still need to go through all the code and change every table and field name. That just seemed like too much risk for this point in time, however, the next app I build will definitely take all this into account.
The thing I don't really like about underscores though is they make the names a bit longer. It's not a big deal, but I'm used to the camel case.
MySQL handles names the same way as MSSQL.
There won't be AS400 support in HelpSpot any time soon
Created by Ian on 08.26.2005 10:08 pm
Would those guys really have not considered HelpSpot because it uses a MySQL DB? It's not like Oracle or SQL Server which cost money to get a hold of.
Created by Michael Sica on 08.26.2005 10:08 pm
Well Michael I think it's a little different dynamic than just that. First off as I said their are definitely camps of DB thought and in the open source world there's pretty defined lines between the MySQL camp and the PostgreSQL camp. However, beyond that you really need to look at the market HelpSpot is in.
HelpSpot is targeted primarily at small to mid-size companies most of whom have some existing IT infrastructure. So if as an IT organization I've made the strategic decision to use PostgreSQL in my shop than an app that doesn't support that has a big negative against it. Not only because I may prefer PostgreSQL, but more practically because now I probably need all different backup scripts and procedures, I have to have people who know how to administer a MySQL server, I have to potential make exceptions to documented processes and so on.
That's really why I had always planned to support it, but I just moved up the time table. At the end of the day even if only a handful of customers need that support, those are sales I'll make that I otherwise would have lost.
From the ISV marketing perspective I also think there's something to be said for being able to support multiple databases. It implies a certain amount of sophistication to the program. That by itself won't sell licenses, but coupled with other factors can be one of those little things that helps tip the balance in your favor.
Created by Ian on 08.26.2005 10:08 pm
I understand your points, but my thought is this. If a company is only going to buy applications that ONLY run on PostgreSQL, won't they have a hard time finding apps in general? MySQL is the big boy on the free database block.
I'm not arguing against supporting multiple DB's, but if I were running a company and there was an app I wanted... as long as it could physically run on my hardware and chosen OS - I'd get it! (If all my servers were Unix I obviously couldn't go with a .Net application running on an Access DB!)
Speaking of DB support... For my app, I built in a DAO layer (Data Access Objects) which handles all the SQL. So if want to support different databases in the future I just need different implementations of those data access objects and I'm golden. I was going to build it for MySQL and Oracle, making one for the small shops and one for the big shops. I've since had a change in mind for my target market and distribution of my app, so I am only developing a MySQL version. (In case you were wondering
)
Created by Michael Sica on 08.26.2005 10:08 pm
Well I think it's not really a matter of won't as much as prefer. If all the other help desk software runs on MySQL (which it pretty much does) but they prefer PostgreSQL then that's a leg up for me. Sure they might buy it anyway if they think my products the best, but I want every leg up I can get! I want to lower the barrier to entry as much as possible.
If the person researching the options can tell the IT head it uses a database they prefer (even if they do in fact run both in their shop) then all the better for me. That point of interaction right there is extremely key. I've personally been in that meeting many times and seen many apps taken off the table, because the IT directory didn't like one of the subsystems an app is based on, even if we did run that app ourselves. Perhaps that's why I'm a bit more sensitive to it.
Also it's nice to be able to off a DB which supports transactions that's OS. I've never really had any issues with my MySQL DB's falling out of sync, but I've usually been running on dedicated hardware. For people who run HelpSpot on a shared hosted server I'm probably going to recommend they use PostgreSQL if available since hosted services DB servers are often under heavy load.
Did you write your own DAO library or it's something Java provides? I've never had much luck with that type of thing. I mean it always works for your basic stuff, but I always get burned on the fringe cases. Maybe that's just me though
Created by Ian on 08.26.2005 10:08 pm
MySQL supports transactions. You have to declare the innodb table type in your create script and you've got transaction support. (http://www.mysql.com/news-and-events/press-release/release_2002_11.html)
"Data Access Object" is more of a design pattern - and just good design in general.
(Here's my 2 second explanation of DAO's.)
Let's say you got a Person object and that Person object needs save or update information about itself to the database. That Person object could have a PersonDAO object inside of it. A call to the Person object's "updateName(Name)" method would delegate down to the PersonDAO object - where the actual "data access" takes place. Consider the Person object as a part of the "business layer" and the PersonDAO is in the "data access layer". In the data access layer you could have different implementations for MySQL, Oracle, SQL Server, etc..... Using good OO design and the factory pattern you could swap out which DB specific objects you needed. Heck, since it's all broken out into clean layers and the business layer doesn't know or care what the DAO's are doing the DAO could be hooked up to a web service or just the plain old file system.
Java is such a large platform that it can be overwhelming when you first get started. Not to mention there is the whole OO thing, and the difference between regular Java classes, Servlets, JSP's, etc.... That's why I program with Java using the Spring Framework. It makes Java easier and fun!
Created by Michael Sica on 08.26.2005 10:08 pm
I know about innoDB, but it's not really a good option because customers using a version of MySQL less than 4.0.3 would have to actively enable it and configure it.
Also you loose the fulltext index capability.
Created by Ian on 08.26.2005 10:08 pm
Ah, I didn't know about the fulltext index feature. That's a shame.
-----
Created by Michael Sica on 08.26.2005 10:08 pm