The Art of Data Science


This post summarizes the Building Data Science Capability in Your Organization event at the OpenGov Hub on January 15, 2016. It was written by Katherine Wikrent and originally appeared on Development Gateway's blog.

The use of data science techniques like data mining, machine learning, and data visualization has enabled the Development Gateway (DG) team to create lasting social impact through projects as diverse as the Results Data Initiative and the Aid Management Platform. To illuminate the ways in which data analytics can be applied to achieve better development outcomes, Development Gateway, in partnership with Booz Allen Hamilton (BAH), offered a half-day conference on “Building Data Science Capability in Your Organization” in January. The meeting served as a chance for leaders across the development community to discuss strategies for building data science capacity and scaling up existing data analytics projects.

The audience enjoyed morning presentations from the World Bank Group, the US Environmental Protection Agency, Johnson and Johnson, and BAH on applying the principles of big data, M&E, and results-based evidence to improve project outcomes. Afternoon sessions involved live demonstrations of three innovative data analytics platforms currently used to help organizations make strategic program management decisions and track performance over time. Premise uses a mobile app that allows citizens across 30 countries to upload macroeconomic indicator data, and combines crowdsourcing, machine learning, and statistical analysis to give donors and financial institutions the data they need to make more informed investment decisions. MONITOR GIS allows USAID/Colombia staff to visualize and interact with indicator, funding, and statistical data to improve reporting and programmatic outputs. BAH’s Sailfish platform uses drag-and-drop tools and an “answers on demand” service to enable users at all technical levels to benefit from data science fundamentals like data querying, machine learning, and statistical visualization. To conclude the presentations, DG shared takeaways from the Results Data Initiative’s crosswalk to illuminate how data science can bridge complex analytical methods and on-the-ground needs.

Throughout the day, attendees had the chance to share their thoughts and engage with the subject matter expert presenters about the present and future of data science. Key takeaways from this dialogue included:

  • All organizations, no matter how well developed their data science teams, face logistical and bureaucratic challenges in obtaining data at disaggregated levels. They also struggle to recruit individuals who have both the technical and statistical knowledge to analyze these data and the understanding of non-technical audiences needed to visualize and describe the data effectively. As such, forming support networks, investing in internal team building, and fostering partnerships with stakeholders are key to successfully managing data science efforts.
  • In spite of these logistical complications, the body of micro- and macro-level data is vast and continues to grow rapidly as the public, private, and nonprofit sectors come to appreciate the value of robust data collection. Furthermore, a growing number of organizations and resources can locate and analyze data rapidly and cost-effectively.
  • In order to employ data science to its full extent, organizations must actively promote capacity development and stimulate curiosity. As one attendee quipped, “We must not hire data scientists; we must build them.”

The collaboration of all presenters and attendees was an important step in creating an open dialogue on the benefits and the future of data science in the field of international development. DG truly values the contribution of all participants, and will tap into these lessons learned as we continue integrating data science methods into our projects. The development community is coming to recognize the returns to data science investments; we look forward to witnessing the diverse ways data analytics will be incorporated into project operations to ensure quality, cost-efficient outcomes across sectors.

More information on the conference can be found here:
