About three years ago I was working on a Salesforce implementation for a campaigning charity. My role was to develop a SQL Server Operational Data Store (ODS) that mirrored the data in their Salesforce instance. After considering a number of options, we decided to develop our own download interface to regularly refresh the SQL Server database from Salesforce. Initially this was developed using Talend as the middleware, which worked pretty well for daily incremental downloads of changed Salesforce records. However, for the initial drawdown of data at go-live, which would be close to 100M rows, the approach was simply too slow. So I ended up producing a second, one-off interface in .NET that cut out Talend and directly accessed the Salesforce Bulk Query API. (At the time the Talend Salesforce connectors did not support the Bulk Query API, though I believe they do now.) This interface was also used extensively to test the data conversion process prior to go-live.
After the project was closed out, I was interested in taking the best elements of both interfaces and producing a standalone application that would allow fast, reliable downloads from Salesforce. At the same time I wanted to address some of the issues I had run into, particularly the tedious management of metadata. Since I was developing the ODS while the Salesforce data model was in constant flux, I was forever changing the mirrored data model in SQL Server. Ideally I wanted an interface that dealt with building and maintaining the SQL-side data model automatically.
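To give a flavour of what building the SQL-side data model automatically can mean, here is a minimal sketch in Python. It is not how Salesforce Download itself does it. It assumes the field metadata has already been fetched and is shaped like the field entries of a Salesforce describe result (name, type, length, precision, scale). The type mapping and the function names sql_type and create_table_sql are hypothetical choices made purely for illustration.

```python
# Illustrative mapping from Salesforce field types to SQL Server column types.
# The choices here are assumptions for this sketch, not a definitive mapping.
TYPE_MAP = {
    "id": "NCHAR(18)",
    "reference": "NCHAR(18)",
    "boolean": "BIT",
    "int": "INT",
    "date": "DATE",
    "datetime": "DATETIME2",
}


def sql_type(field):
    """Pick a SQL Server column type for one field description."""
    ftype = field["type"]
    if ftype in TYPE_MAP:
        return TYPE_MAP[ftype]
    if ftype in ("double", "currency", "percent"):
        return f"DECIMAL({field['precision']},{field['scale']})"
    # Everything else (string, picklist, textarea, ...) is stored as text.
    length = field.get("length", 0)
    if 0 < length <= 4000:
        return f"NVARCHAR({length})"
    return "NVARCHAR(MAX)"


def create_table_sql(table, fields):
    """Build a CREATE TABLE statement from a list of field descriptions."""
    cols = ",\n".join(f"    [{f['name']}] {sql_type(f)}" for f in fields)
    return f"CREATE TABLE [{table}] (\n{cols}\n);"


# Hand-written sample metadata standing in for a real describe call.
sample_fields = [
    {"name": "Id", "type": "id", "length": 18},
    {"name": "LastName", "type": "string", "length": 80},
    {"name": "Description", "type": "textarea", "length": 32000},
    {"name": "Amount__c", "type": "currency", "precision": 18, "scale": 2},
]

ddl = create_table_sql("Contact", sample_fields)
print(ddl)
```

Keeping the mirror up to date as the Salesforce model changes would then be a matter of comparing the generated columns against the existing table and issuing ALTER TABLE statements for the differences, which this sketch leaves out.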
Predictably, subsequent work and other commitments meant I had little time to work on this. So it’s taken three years, on and off, to get a usable product up and running, but I now have a version 1 of Salesforce Download. In that time I’ve not been working with any Salesforce clients, so I have decided to make it available as an open source project for anyone who may find it useful. You can check out some demonstrations here.