<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Open Science on Kailas Venkitasubramanian</title>
    <link>/tags/open-science/</link>
    <description>Recent content in Open Science on Kailas Venkitasubramanian</description>
    <generator>Hugo</generator>
    <language>en</language>
    <lastBuildDate>Thu, 10 Apr 2025 00:00:00 +0000</lastBuildDate>
    <atom:link href="/tags/open-science/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Embracing multilingualism in data science</title>
      <link>/blog/series/reproducible-research-series/2025-04-10-multilingualism-in-data-science/</link>
      <pubDate>Thu, 10 Apr 2025 00:00:00 +0000</pubDate>
      <guid>/blog/series/reproducible-research-series/2025-04-10-multilingualism-in-data-science/</guid>
      <description>&lt;p&gt;In the previous posts of this series, I covered &#xA;&lt;a href=&#34;/blog/series/reproducible-research-series/2022-07-08-building-blocks-of-a-reproducible-research-framework/&#34;&gt;why reproducibility matters&lt;/a&gt; and how we are &#xA;&lt;a href=&#34;/blog/series/reproducible-research-series/2022-04-10-designing-reproducible-data-pipelines/&#34;&gt;designing reproducible data pipelines&lt;/a&gt; at the UNC Charlotte Urban Institute. Both of those efforts rest on a more basic question: which programming languages should a small research team actually use? This post is about that layer underneath both.&lt;/p&gt;&#xA;&lt;p&gt;Specifically, I want to argue that embracing &lt;em&gt;multilingualism&lt;/em&gt;, meaning fluency in both R and Python rather than loyalty to one, has quietly done more for our team&amp;rsquo;s output than almost any other choice we&amp;rsquo;ve made.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Changing CRDT operations under a Cloud</title>
      <link>/blog/series/crdt-telenovela-series/2023-06-09-making-sense-of-data-and-documenting-it/</link>
      <pubDate>Fri, 09 Jun 2023 00:00:00 +0000</pubDate>
      <guid>/blog/series/crdt-telenovela-series/2023-06-09-making-sense-of-data-and-documenting-it/</guid>
      <description>&lt;h2 id=&#34;the-promise-and-peril-of-a-large-contract&#34;&gt;The promise and peril of a large contract&#xA;  &lt;a href=&#34;#the-promise-and-peril-of-a-large-contract&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;&#xA;      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;&#xA;      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;&#xA;    &lt;/svg&gt;&lt;/a&gt;&#xA;&lt;/h2&gt;&#xA;&lt;p&gt;Many of the previous challenges in managing technical operations at the data trust stemmed from a poor understanding of the scope and effort that a given piece of work required, and from having no barometer to measure productivity (or the lack of it). As a result, everyone knew that a given piece of work took one month to complete, and everyone agreed that this delay was not acceptable, but no one could really pinpoint where the bottlenecks were or why they existed.&lt;/p&gt;</description>
    </item>
    <item>
      <title>Plunging into the Data Trust black box, and Deep Cleaning the System</title>
      <link>/blog/series/crdt-telenovela-series/2023-05-20-plunging-into-data-trust/</link>
      <pubDate>Sat, 20 May 2023 00:00:00 +0000</pubDate>
      <guid>/blog/series/crdt-telenovela-series/2023-05-20-plunging-into-data-trust/</guid>
      <description>&lt;h2 id=&#34;diving-into-the-world-of-administrative-data-and-crdt&#34;&gt;Diving into the world of administrative data and CRDT&#xA;  &lt;a href=&#34;#diving-into-the-world-of-administrative-data-and-crdt&#34;&gt;&lt;svg class=&#34;anchor-symbol&#34; aria-hidden=&#34;true&#34; height=&#34;26&#34; width=&#34;26&#34; viewBox=&#34;0 0 22 22&#34; xmlns=&#34;http://www.w3.org/2000/svg&#34;&gt;&#xA;      &lt;path d=&#34;M0 0h24v24H0z&#34; fill=&#34;currentColor&#34;&gt;&lt;/path&gt;&#xA;      &lt;path d=&#34;M3.9 12c0-1.71 1.39-3.1 3.1-3.1h4V7H7c-2.76.0-5 2.24-5 5s2.24 5 5 5h4v-1.9H7c-1.71.0-3.1-1.39-3.1-3.1zM8 13h8v-2H8v2zm9-6h-4v1.9h4c1.71.0 3.1 1.39 3.1 3.1s-1.39 3.1-3.1 3.1h-4V17h4c2.76.0 5-2.24 5-5s-2.24-5-5-5z&#34;&gt;&lt;/path&gt;&#xA;    &lt;/svg&gt;&lt;/a&gt;&#xA;&lt;/h2&gt;&#xA;&lt;p&gt;&amp;ldquo;Administrative data is messy&amp;rdquo; is not so much an adage as it is a reality. When I took the reins of the data infrastructure and analytical operations of the Institute for Social Capital, or ISC (now called the Charlotte Regional Data Trust), in the middle of 2021, the messiness extended beyond the data. The dysfunction ran deep in how data was collected and organized, how data operations and analyses were conducted, how information was gathered from stakeholders, and how data was disseminated.&lt;/p&gt;</description>
    </item>
    <item>
      <title>UI Reproducibility Project</title>
      <link>/project/ui-reproducibility-project/</link>
      <pubDate>Sat, 31 Dec 2022 00:00:00 +0000</pubDate>
      <guid>/project/ui-reproducibility-project/</guid>
      <description>&lt;div id=&#34;&#34; class=&#34;panelset&#34;&gt;&#xA;  &#xD;&#xA;&lt;div class=&#34;panel&#34;&gt;&#xA;  &lt;div class=&#34;panel-name&#34;&gt;Summary&lt;/div&gt;&#xA;  &#xA;  &lt;p&gt;&#xA;&#xA;&#xA;&#xA;&lt;h5 id=&#34;background&#34;&gt;Background&#xA;  &lt;a href=&#34;#background&#34;&gt;&lt;/a&gt;&#xA;&lt;/h5&gt;&#xA;&lt;p&gt;Diverse research backgrounds, skills, and operational practices make our institute versatile and nimble in addressing research problems that cross several domains. But they have also allowed analytical practices to remain fragmented and inefficient.&lt;/p&gt;&#xA;&lt;p&gt;The Urban Institute data science team recognized the significance of reproducibility in analytical community research practice in two distinct contexts: 1) operational efficiency via streamlined use and reuse of data, analytical tools, and assets; and 2) developing a culture of transparency and trust that underpins reproducible research whose products are fully replicable and auditable.&lt;/p&gt;</description>
    </item>
    <item>
      <title>UI Data and Analytics Guide</title>
      <link>/project/ui-data-analytics-guide/</link>
      <pubDate>Wed, 03 Aug 2022 00:00:00 +0000</pubDate>
      <guid>/project/ui-data-analytics-guide/</guid>
      <description>&lt;div id=&#34;&#34; class=&#34;panelset&#34;&gt;&#xA;  &#xD;&#xA;&lt;div class=&#34;panel&#34;&gt;&#xA;  &lt;div class=&#34;panel-name&#34;&gt;Summary&lt;/div&gt;&#xA;  &#xA;  &lt;p&gt;&#xA;&#xA;&#xA;&#xA;&lt;h5 id=&#34;objectives-and-scope&#34;&gt;Objective(s) and Scope&#xA;  &lt;a href=&#34;#objectives-and-scope&#34;&gt;&lt;/a&gt;&#xA;&lt;/h5&gt;&#xA;&lt;p&gt;The project aims to create a comprehensive guide to all operational processes of the Urban Institute, serving as a primary point of reference for all research staff in managing the data and analytical resources of the institute.&lt;/p&gt;&#xA;&lt;p&gt;The manual will be created using R Markdown, a tool that allows for the creation of rich, interactive documents. The manual will be hosted as a website that can be easily updated and maintained by team members.&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
