<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Reproducible Research on Kailas Venkitasubramanian</title>
    <link>/categories/reproducible-research/</link>
    <description>Recent content in Reproducible Research on Kailas Venkitasubramanian</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Thu, 10 Apr 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="/categories/reproducible-research/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Embracing multilingualism in data science</title>
      <link>/blog/series/reproducible-research-series/2025-04-10-multilingualism-in-data-science/</link>
      <pubDate>Thu, 10 Apr 2025 00:00:00 +0000</pubDate>
      <guid>/blog/series/reproducible-research-series/2025-04-10-multilingualism-in-data-science/</guid>
      <description>&lt;p&gt;In the previous posts of this series, I covered &#xA;&lt;a href=&#34;/blog/series/reproducible-research-series/2022-07-08-building-blocks-of-a-reproducible-research-framework/&#34;&gt;why reproducibility matters&lt;/a&gt; and how we are &#xA;&lt;a href=&#34;/blog/series/reproducible-research-series/2022-04-10-designing-reproducible-data-pipelines/&#34;&gt;designing reproducible data pipelines&lt;/a&gt; at the UNC Charlotte Urban Institute. Both of those efforts rest on a more basic question: which programming languages should a small research team actually use? This post is about that layer underneath both.&lt;/p&gt;&#xA;&lt;p&gt;Specifically, I want to argue that embracing &lt;em&gt;multilingualism&lt;/em&gt;&amp;mdash;fluency in both R and Python, rather than loyalty to one&amp;mdash;has quietly done more for our team&amp;rsquo;s output than almost any other choice we&amp;rsquo;ve made.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Designing Reproducible Data Pipelines for Community Research</title>
      <link>/blog/series/reproducible-research-series/2022-04-10-designing-reproducible-data-pipelines/</link>
      <pubDate>Sat, 09 Mar 2024 00:00:00 +0000</pubDate>
      <guid>/blog/series/reproducible-research-series/2022-04-10-designing-reproducible-data-pipelines/</guid>
      <description>&lt;p&gt;In the first post of this series, I argued that reproducibility is not a technical luxury for community research institutions—it is an ethical and operational obligation. In this post, I want to move from philosophy to plumbing—because this is where reproducibility becomes real.&lt;/p&gt;&#xA;&lt;p&gt;Specifically: what does it mean to design &lt;em&gt;reproducible data pipelines&lt;/em&gt; in a community research environment?&lt;/p&gt;&#xA;&lt;p&gt;At the UNC Charlotte Urban Institute, this question became concrete as we built the &lt;strong&gt;Quality of Life Explorer&lt;/strong&gt;, developed deposit and extraction pipelines for the &lt;strong&gt;Charlotte Regional Data Trust&lt;/strong&gt;, and began orchestrating workflows using Apache Airflow in an AWS environment.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Releasing v1.0 of the Charlotte-Mecklenburg Quality of Life Explorer Data Pipeline</title>
      <link>/blog/posts/2023-03-16-releasing-v1-0-of-the-charlotte-mecklenburg-quality-of-life-explorer-data-pipeline/</link>
      <pubDate>Thu, 16 Mar 2023 00:00:00 +0000</pubDate>
      <guid>/blog/posts/2023-03-16-releasing-v1-0-of-the-charlotte-mecklenburg-quality-of-life-explorer-data-pipeline/</guid>
      <description>&lt;script src=&#34;/blog/posts/2023-03-16-releasing-v1-0-of-the-charlotte-mecklenburg-quality-of-life-explorer-data-pipeline/index_files/fitvids/fitvids.min.js&#34;&gt;&lt;/script&gt;&#xD;&#xA;&#xA;&#xA;&#xA;&#xA;&lt;h2 id=&#34;a-successful-wrap&#34;&gt;A successful wrap&#xA;  &lt;a href=&#34;#a-successful-wrap&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;&#xA;      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;&#xA;      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;&#xA;    &lt;/svg&gt;&lt;/a&gt;&#xA;&lt;/h2&gt;&#xA;&lt;p&gt;It’s a great day today.&lt;/p&gt;&#xA;&lt;p&gt;We completed version 1.0 of the Charlotte-Mecklenburg &#xA;&lt;a href=&#34;/project/qol-data-pipeline-automation/&#34;&gt;Quality of Life Explorer automation&lt;/a&gt; project. It’s hard to put into words how exciting and rewarding this feels, but I’ll try anyway.&lt;/p&gt;&#xA;&lt;p&gt;We managed to automate most of the data and computational processes needed to generate the 80-odd quality of life indicators featured in the explorer and to create functional data pipelines that serve the application. Through this work, we’ve significantly reduced our project workload and fundamentally transformed the nature of our engagement in this project. The completion of this work also revitalizes our team’s vision of building a &#xA;&lt;a href=&#34;/project/ui-reproducibility-project&#34;&gt;reproducible data science framework&lt;/a&gt; at the Urban Institute and a &#xA;&lt;a href=&#34;/talk/towards-reproducible-data-science&#34;&gt;unified data ecosystem&lt;/a&gt;. 
Let me tell you how all this worked out.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Towards reproducible data science for community and policy research - An experiential roadmap</title>
      <link>/talk/towards-reproducible-data-science/</link>
      <pubDate>Mon, 06 Mar 2023 00:00:00 +0000</pubDate>
      <guid>/talk/towards-reproducible-data-science/</guid>
      <description>On developing a reproducible data science framework and practice at the Charlotte Urban Institute</description>
    </item>
    <item>
      <title>Charlotte Regional Data Trust - Technical Operations Manual</title>
      <link>/talk/tech-operations-manual/</link>
      <pubDate>Sat, 06 Aug 2022 00:00:00 +0000</pubDate>
      <guid>/talk/tech-operations-manual/</guid>
      <description>On how we developed the technical operations manual at the Charlotte Regional Data Trust</description>
    </item>
  </channel>
</rss>
