Enterprise data integration needs are growing exponentially over time, as is the interest in open source technologies and the adoption of open source solutions. With this in mind Talend conducted a survey to define the usage landscape of open source data integration and to profile users of this technology. The data used in this analysis was collected from 1013 survey participants. Responses came primarily from the U.S. (56.5%), followed by Europe (35.2%), with the rest of the responses (8.3%) originating in the rest of the World.
Table of Contents
Introduction
Background
Diverse Data Integration Projects
Data Integration Needs and Tools
Open Source Data Integration vs. Proprietary Solutions
Enterprise Requirements
Community Support
Community Involvement
Conclusion
Talend White Paper
Usage Landscape - Enterprise Open Source Data Integration
Introduction
Background
Enterprise data integration needs are growing exponentially over time, as is the interest in open source technologies and the adoption of open source solutions.
With this in mind, Talend conducted a survey to define the usage landscape of open source data integration and to profile users of this technology. The data used in this analysis was collected from 1013 survey participants. Responses came primarily from the U.S. (56.5%), followed by Europe (35.2%), with the rest of the responses (8.3%) originating in the rest of the World.
Figure: Survey respondents’ demographics (US 57%, Europe 35%, Other 8%)
As companies merge, acquire new applications, and build their IT platforms by incorporating disparate applications with legacy systems, information systems are becoming more and more heterogeneous. As a result, data integration tools are now indispensable if enterprise IT departments are to properly manage the flows of data across the information system.
In addition, alternative models of software deployment, such as Software as a Service (SaaS), and the need for interoperability with partners, customers, providers, etc., all have an important impact on data integration requirements.
Data Integration: The process of combining data residing at different sources and providing the user with a unified view of these data.
The global economy is imposing cost controls on IT Managers, both in terms of staff and software, at a time when data integration represents an increasingly larger percentage of the enterprise IT budget. Asked to do more with less, IT personnel would be better off spending cycles on tasks other than the time-consuming manual scripting needed to meet custom requirements. In fact, software resources with lower acquisition and operation costs would allow IT Managers to more easily deploy enterprise-grade solutions.
In this context, open source solutions offer a very compelling argument. Open source tools can automate and maintain tasks formerly requiring manual scripts, and the existing skills of the IT implementation team easily transfer to an open source offering. In addition, IT departments don’t have to justify significant up-front fees.
Diverse Data Integration Projects
Data integration is the collective term for technologies that include ETL (Extract-Transform-Load) for business intelligence and data warehousing, and operational data integration, that is, the flows of data across operational applications and systems. These needs can range from high throughput batch transfers of data to near-real-time, trickle-feed data flows.
Project Type
Consistent with the global data integration market distribution, whether open source or proprietary, most of the survey participants (61.5%) use open source solutions for their ETL projects, in particular for BI, data warehousing and analytics. This can be attributed to the fact that ETL is the most mature segment of the entire data integration market.
Figure: Types of projects for which open source data integration is used (ETL; Data Loading; Operational Data Integration: Batch; Migration; Operational Data Integration: Real Time; Database Synchronization; axis 0% to 70%)
Data Loading
Data loading (41.9%) and data migration (26.5%) are the second and fourth most popular types of project. Both of these are good candidates for open source solutions, as they are typically one-offs, with no ongoing purpose that would justify a long-term investment in an expensive proprietary tool.
Data synchronization (19.1%) is also a popular type of project conducted by open source data integration users.
Data Loading: The process of loading data in an application or database, for example prior to its deployment.
Data Migration: The process of transferring data between databases, applications or other systems, with the purpose of replacing a system with another.
Data Synchronization: The process of establishing data consistency on remote sources and continually harmonizing the data over time.
Batch vs. Real-Time
Operational data integration, whether batch or real-time, is also a good fit for open source solutions. As business tempos speed up, real-time and nearly real-time operational data integration projects will prevail over bulk transfer projects. As of the date of the survey, 40% of participants used open source tools to manage their batch operational data integration tasks, compared to only 22.9% for real-time projects, but the latter is a much faster growing segment.
ETL vs. Operational Data Integration
Taken together, batch and real-time operational data integration projects (62.9%) are slightly better represented than ETL usage share (61.5%), even though the former market segment is less mature. And, if we also add in data synchronization, the operational project share reaches 82%. The reason for this over-representation is simply that open source tools are particularly appropriate for operational projects because they meet a number of data integration requirements, whereas, traditionally, proprietary tools focus on ETL. In addition, enterprises that want to diversify their data integration tools are often discouraged by the licensing costs of proprietary applications. Open source solutions offer a greater breadth of connectivity and more flexibility in terms of adoption, deployment, and maintenance.
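The combined shares quoted above can be reproduced from the per-project figures reported earlier. The sketch below is only an illustration of that arithmetic: the dictionary and function names are hypothetical, and the sums follow the paper's own method of adding shares. Because the six shares total more than 100%, respondents evidently selected several project types, so a combined share counts selections, not distinct respondents.

```python
# Share of respondents using open source data integration per project type,
# in percent, as reported in the survey. Names here are illustrative only.
PROJECT_SHARES = {
    "ETL": 61.5,
    "Data loading": 41.9,
    "Operational data integration: batch": 40.0,
    "Data migration": 26.5,
    "Operational data integration: real time": 22.9,
    "Data synchronization": 19.1,
}


def combined_share(*project_types: str) -> float:
    """Add the reported shares, rounded to one decimal like the survey figures."""
    return round(sum(PROJECT_SHARES[p] for p in project_types), 1)


def rank(project_type: str) -> int:
    """Return the 1-based popularity rank of a project type (highest share first)."""
    ordered = sorted(PROJECT_SHARES, key=PROJECT_SHARES.get, reverse=True)
    return ordered.index(project_type) + 1


# Batch plus real-time operational data integration.
operational_share = combined_share(
    "Operational data integration: batch",
    "Operational data integration: real time",
)

# The same, with data synchronization added in.
operational_with_sync = combined_share(
    "Operational data integration: batch",
    "Operational data integration: real time",
    "Data synchronization",
)

print(f"Operational (batch + real time): {operational_share}%")
print(f"Operational incl. synchronization: {operational_with_sync}%")
print(f"ETL: {PROJECT_SHARES['ETL']}%")
```

The same helper confirms the ordering stated in the Data Loading section: `rank("Data loading")` and `rank("Data migration")` give the second and fourth positions.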
Data Integration Needs and Tools
Although software companies are trying to provide unified integration solution packages, the data integration needs for most enterprises are so complex that they often need to multiply the number and nature of the integration software products they use.
Survey participants reported using a combination of commercial applications, open source solutions, and database utilities to meet their data integration needs.
The statistics show that using open source and commercial solutions in combination is very common (31.2%), and that the two can, and do, coexist on the same platform. In fact, open source solutions are often complementary to an existing proprietary solution that, for whatever reason, cannot address a specific need. In some cases it may be that it’s not worth the expense of investing in a proprietary solution extension.
The high incidence of database utilities shown in the survey results (53.9%) is as expected: these utilities are a no-cost solution and are usually included with the databases. Their usefulness, however, is limited to dedicated database usage.
Applications are often stacked as needs arise, increasing connectivity issues, whether enterprises want their CRM system to communicate with their ERP module, or to have their disparate databases exchanging information with their home-grown platform.
Faced with multiple connectivity issues, enterprises often have no option other than manual scripting to keep data flowing across their heterogeneous enterprise systems. This is why the survey results rank manual scripting as one of the technologies most frequently invoked (54.7%) by enterprises to meet their integration needs. Although this is much higher than commercial (31.2%) packaged technologies, it is not surprising that manual scripting is the solution of choice as it carries the lowest initial cost.
Although manual scripting is often intended to be a short-term fix for interchange issues, once in production it often becomes a permanent solution. And, in the end, this simple stop-gap can become an entire home-grown platform. The drawback of hand coding or home-grown platforms surfaces over time in the inevitable maintenance problems that increase the TCO. The advantage, however, is that it fits a particular need that none of the available commercial or open source solutions can meet.
Open Source Data Integration vs. Proprietary Solutions
In an ongoing effort to lower their data integration software TCO, many enterprises are now considering open source solutions, not just for one-time projects, but also for their ongoing mission-critical processes, to replace or complement their expensive CPU-dependent solutions.
Figure: Decision criteria (ease of use; performance; avoid lock-in; no licensing costs; source code access; each rated Very important, Important, Neutral, or Not important)
Open source solutions are a real alternative to the proprietary world. Key players have made major strides toward improving the usability and friendliness of open source technologies, traditionally a weak spot for these applications.
In just a few short years, open source has evolved from something “geeky” into an enterprise-ready solution. Today, open source solutions are sufficiently feature-rich to meet complex user requirements. The survey results reflect these expectations.
Respondents felt most strongly about ease-of-use (59%) and performance (53.9%) as the most important aspects of an open source data integration solution.
Surprisingly, licensing cost is not the gating criterion for enterprises turning to open source solutions. It actually comes fourth after performance, ease of use, and no lock-in (42.5%), with only 42.1% of respondents considering it very important.
Access to the source code comes last on most priority lists when enterprises are choosing open source tools.
It is a common misconception that control of the source code is important for users of open source software. Most users today understand that open source solutions are as mature as their proprietary counterparts and, therefore, don’t feel the need to enhance the code themselves.
Today, open source solutions are advantageously replacing the source code escrow of proprietary software. However, few enterprises want to allocate in-house resources (or even have the expertise) to edit, enhance, and maintain their data integration applications code.
Enterprise Requirements
An analysis of the survey data indicates that users expect the same performance and enterprise-scale features from open source solutions that they previously found only in proprietary products. In order of importance these features include:
• centralized scheduling and execution dashboard
• shared repository
• administration tools
Figure: Enterprise open source data integration requirements (scheduling tool; dashboard; shared repository; administration tool; axis 0% to 70%)
First, 60.5% of respondents want a scheduling tool that lets them consolidate and centralize their technical processes. Second, 57.8% of users need a dashboard to centrally monitor processes as they execute. Because enterprise users often work in teams and need to share data on large-scale projects, 54.9% consider a shared repository essential. Finally, 38.4% of enterprise users want an administration tool to centrally manage users and projects.
However, not all companies have enterprise-scale requirements. Single users and SMBs might not need that sort of enterprise-grade feature. What emerges is that open source solutions address diverse needs for a variety of user profiles, whether large or small.
Community Support
As shown, enterprises want the same support with open source solutions that commercial applications provide. The major difference lies in the fact that a significant number of open source users (84.9%) would rather call on the community for help addressing issues than get support from a dedicated service. This lets them reduce the cost of support and decrease their data integration budget; the return they get from the community is comparable in quality to traditional support from a proprietary vendor.
Figure: Community vs. commercial support expectations (community support such as forums; email-based or Web-based support; guaranteed response times; phone support; axis 0% to 100%)
Open source users value the forum and the other community tools at their disposal, as well as the peace of mind that comes from knowing that there is no pressure to upgrade or to buy new tools. The community also tends to be more responsive than traditional support services, and community tools are no-cost to the enterprise.
However, enterprise users working on mission-critical projects do need (and demand) vendor-provided, enterprise-grade technical support. This still represents a minority of the total number of users of open source data integration (20.9%), but is a fast growing proportion.
Community Involvement
Two-thirds of the respondents say that they are willing to actively participate in the community, and nearly half are ready to help beta-test open source products. Open source communities have a real, live QA lab of thousands at their disposal. Open source users appreciate getting support from the community and feel at ease in sharing their experiences and helping other users solve problems. Getting involved in the community ensures the sustainability of the