After two BI projects sourcing from Workday, I have researched, experimented and used a number of different tools to pull data out of Workday. To get started, let us look at some of the options available.
- Workday Studio – An Eclipse based development environment for advanced integration development.
- Core Connectors – Pre-built integration templates that extend Workday functionality to external endpoints.
- Public Workday APIs – Open standards-based SOAP API for programmatic access to Workday’s On-Demand business Management Services.
- Custom Reports – Using Report Writer from within Workday, you can create custom reports. Custom reports can be ran in Workday, exposed as web services otherwise known as Reporting as a Service (RaaS), or fed to integrations and delivered as files.
Now in English…
- Workday Studio – At first blush this looks like a cloud based ETL tool, but it is not really. It uses Public APIs and Custom reports (RaaS) as its data sources. The transform stages available basically convert formats of the integration output from XML to JSON or XML to PDF for example. It offers functionality to do things like compress, encrypt, and ftp integrations. The closest thing to ETL transform features would be the ability to split and combine integration outputs.
The functionality offered is really geared towards integration activities, and is probably not suitable for ETL. It is a decent tool for running a custom report, converting the format, encrypting and compressing the file, sending the file to an outbound vendor or customer, logging some events, and sending a notification that the integration was sent successfully.
My general recommendation to a BI team is to NOT use this tool, instead just get your data through one of the other means mentioned here. Then use your ETL tool, scripts, or SQL for any data manipulation.
- Core Connectors – Are basically user friendly ways to access the Public APIs (although not as friendly as I would like). It is a web interface in Workday with checkboxes to pick the sections of data you want, some basic mapping capabilities, and the ability to run change audits (which act like CDC).
Again, I would just access the Public APIs directly, if you have the access and some technical chops. The interface is not as user friendly as I would like, and I would rather do CDC and mapping operations in the ETL tool my team is already familiar with.
- Public APIs – Deliver standard sets of data in XML format with the ability to choose sections of data returned. Some allow filtering by datetime or individual reference ID. Other than that filtering is limited. Data is returned by pages with a max of 999 entries, which helps with performance and timeouts. Also, a version number is part of the input, which is actually used to supply data. This means you can access previous versions of the public API, which in theory lessons the impact of Workday updates to your nightly load.
Overall this is a pretty good option for BI. Navigating and understanding the large XML files that are returned can be a little challenging. You also need to be sure you understand what the data being returned is, and how it relates to what users/report writes see in Workday.
- Custom Reports – At first seem really easy, but can get a bit complicated with advanced use. Workday may force you to use certain prompts for particular objects. This has the potential to make certain objects hard to get wide swaths of data needed for BI loads (the impact of this has been low in Workday HCM). Custom reports can be turned into web services easily with the click of a check box. You can search for objects and see descriptions right in the reporting interface. Filters and Prompts can be added.
This is another good option. In the two implementations I’ve worked on and others that I have heard of, this has been the predominant method used. It is easier to find the objects and data you need than the public APIs. Also, troubleshooting is easier, since you can add filters and run a copy of the report in Workday.
I do want to add a word of caution. That building custom reports seems very easy and intuitive at first, but can become pretty challenging when working on anything other than basic reports. Also, understanding the Workday Object model and data is a pretty large task.
Now let’s talk a little bit about delivery. You have 2 main options. You can access the data directly using SOAP requests, or you can create a document and deliver it using Enterprise Integration Builder (EIB).
EIB – Fairly painless and intuitive. CSV format can be selected for custom reports. For Public APIs a XSLT template can be created and added to create CSV files. Files can be SFTPed to a DMZ or ETL servers. EIBs will typically need Integration Service Users and special security groups set up. They will also need to be scheduled to run in Workday.
EIBs work well for nightly or infrequent batch loads that can be easily scheduled. If your project timeline is compressed, I would look into this option. This is probably the quickest way to get data in your hands. Since the data is delivered in CSV, there is a performance benefit when loading the data over SOAP requests that come in the form of XML. XML tags make the XML versions larger, and more time consuming to parse.
Keep in mind though, there will be some coordination with ETL development that needs to happen when migrating changes. Added or removed columns can break ETL jobs (depending on tool/code), so generally they will need to be migrated to production at the same time. You will be somewhat limited by the delivered and calculated fields you can pass as values when executing EIBs.
I’ve used this method at one client and it has worked well so far. The only hiccup was when trying to create an advanced schedule that used custom calculated fields to pass a date for restatement of previous months data. The way effective date prompts work, it would not let us pass a value to the calculated field we created for the prompt. We were doing this so we could dynamically generate the data we wanted based on what we passed to the prompt. We did not experience any issues with delivery of reports to the FTP server.
SOAP requests – both Public APIs and RaaS (again Raas is just custom reports created with the ‘Enable as Web Service’ option enabled). Creating ETL jobs and processes to access data via this method may take a little more work/time than versions sourcing from an EIB. Since the data is delivered in XML, the ETL tool will need to parse the data. This means understanding what fields are multi instance, and how you would like to handle them becomes a consideration. The parsing of XML will also, make the ETL load slower compared to a csv file delivered by EIB. You may also need to keep an eye out for report timeouts when loading large sets of data. You may need to get creative and add filters to partition the data into multiple result sets for initial loads. You will also want to make sure you have an incremental load set up for large objects.
A huge benefit to this method is that you get more control over scheduling and flexibility on parameter passing, as the your ETL environment can manage both of these. Basically when you run your nightly ETL load, the ETL process/tool calls the report. Any dates or prompt values that need to be passed can be persisted in tables/files and generated by the ETL script when launching the job.
In my view it basically comes down to Public API vs Custom report and SOAP request vs. EIB.
For most customers especially working with Workday HR, I think Custom Reports delivered using EIB will work fine. If you need near real time data or have tight load dependencies, you may want to consider using RaaS and accessing the data directly via Soap request.
In summary here are the Pro’s and Con’s to each as I see them.
|Data delivered in ‘pages’, which helps with performance and timeouts.||Can be tough to troubleshoot.|
|Ability to pass version as a parameter. This reduces the need to make adjustments to your process along with Workday upgrades.||Can not add custom fields or filters to the report. What you see is what you get.|
|Easier to understand what data you are pulling and how it relates.||May need to break reports up, due to timeouts.|
|Easier to troubleshoot.||Need to be aware of changes to reporting fields. As well as test your process before upgrades.|
|Less development time.||Integrations have to be scheduled in Workday.|
|Faster ETL loads.||Files are delivered. More steps, more chances for things to go wrong.|
|Multi-instance fields are automatically concatenated into a single column.|
|More control from the BI environment.||Need to develop SOAP request process on ETL side.|
|Less Workday objects to manage.||Need to be careful of parsing multi-instance fields.|
|May need to split up large data sets, via additional filters in reports.|