<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xx="categories" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/"><channel><title>Data - Ken Muse</title><atom:link href="https://www.kenmuse.com/categories/data/rss/" rel="self" type="application/rss+xml"/><link>https://www.kenmuse.com/categories/data/</link><description>Discover Azure, DevOps, and development insights with Ken Muse, a DevOps Architect at GitHub and 4x Microsoft Azure MVP</description><language>en-us</language><sy:updatePeriod>weekly</sy:updatePeriod><sy:updateFrequency>1</sy:updateFrequency><image><title>Data - Ken Muse</title><link>https://www.kenmuse.com/categories/data/</link><width>32</width><url>https://www.kenmuse.com/categories/data/favicon/favicon-32x32.png</url><height>32</height></image><atom:link href="https://www.kenmuse.com/categories/data/rss/index.xml" rel="self" type="application/rss+xml"/><item><title>Doing DevOps With Databricks</title><link>https://www.kenmuse.com/blog/databricks-devops/</link><pubDate>Thu, 03 Nov 2022 00:00:00 -0400</pubDate><guid isPermaLink="false">databricks-devops</guid><category>Azure</category><category>Data</category><category>DevOps</category><description>&lt;p&gt;Databricks is an exciting and powerful platform for creating solutions that can process big data into actionable content. Originally, the platform lacked several important aspects that are necessary to fully automate the platform. This historically limited the ability to integrate it into a holistic DevOps practice. Thankfully, those days are long past. Those limitations have been removed, making it possible to utilize the platform more fully.&lt;/p&gt;
&lt;p&gt;There are three parts to the DevOps story with this platform: infrastructure, notebooks, and jobs. Over the last few years, the Databricks team has worked hard to build up the DevOps story around these aspects. Today, we&amp;rsquo;ll explore some of those features!&lt;/p&gt;</description><enclosure type="image/png" url="https://www.kenmuse.com/blog/databricks-devops/images/banner.png"/></item><item><title>Implementing DevOps for Azure Data Factory</title><link>https://www.kenmuse.com/blog/implementing-devops-for-azure-data-factory/</link><pubDate>Thu, 27 Oct 2022 00:00:00 -0400</pubDate><guid isPermaLink="false">implementing-devops-for-azure-data-factory</guid><category>Azure</category><category>Data</category><description>&lt;p&gt;There&amp;rsquo;s a lot of documentation around using Azure Data Factory, but surprisingly little on implementing DevOps practices. There&amp;rsquo;s even less when it comes to implementing an automated workflow. Typically, the system expects you to design and build the pipelines within the provided user interface, then press Publish.&lt;/p&gt;
&lt;h2 id="the-native-cicd-process"&gt;
&lt;a class="heading-link" href="#the-native-cicd-process"&gt;The Native CI/CD Process&lt;span class="fa-solid fa-link" aria-hidden="true"&gt;&lt;/span&gt;&lt;/a&gt;
&lt;/h2&gt;&lt;p&gt;Under the covers, edits in ADF create a series of JSON files that store the configuration. Changes made in the portal are saved to these files. When the configuration is published, those JSON documents are used to generate ARM templates. If ADF is configured to use a Git repository, a copy of the published templates is pushed to the publish branch (typically, &lt;code&gt;adf_publish&lt;/code&gt;). This branch can then be used for automation and deployment to other environments.&lt;/p&gt;</description><enclosure type="image/png" url="https://www.kenmuse.com/blog/implementing-devops-for-azure-data-factory/images/banner.png"/></item><item><title>Azure Data Factory DevOps</title><link>https://www.kenmuse.com/blog/azure-data-factory-devops/</link><pubDate>Thu, 20 Oct 2022 00:00:00 -0400</pubDate><guid isPermaLink="false">azure-data-factory-devops</guid><category>Azure</category><category>Data</category><description>&lt;p&gt;It&amp;rsquo;s been fun spending time re-exploring the data platform this week. After months away from the world of big data, I&amp;rsquo;ve been amazed at how far the tools have come. Two years ago, the support for DevOps tools and practices was very limited. Now, it&amp;rsquo;s substantially more robust. Don&amp;rsquo;t get me wrong: you could create workable solutions. It was just harder than necessary.&lt;/p&gt;
&lt;p&gt;Azure Data Factory (ADF) was a surprising entry into the big data space. Microsoft actually started the project as code-first, so there was a level of support for DevOps from the beginning. It generates code that can be edited and modified by users, and the environment could be integrated with Git. It wasn&amp;rsquo;t a perfect solution, but it was great to see them thinking about the problem.&lt;/p&gt;</description><enclosure type="image/png" url="https://www.kenmuse.com/blog/azure-data-factory-devops/images/banner.png"/></item><item><title>Azure SQL Database Ledger</title><link>https://www.kenmuse.com/blog/azure-sql-database-ledger/</link><pubDate>Wed, 23 Mar 2022 00:00:00 -0400</pubDate><guid isPermaLink="false">azure-sql-database-ledger</guid><category>Azure</category><category>Data</category><description>Need to prove that your database was not tampered with? Azure SQL Database Ledger can help! Learn how a database blockchain can help!</description><enclosure type="image/jpeg" url="https://www.kenmuse.com/blog/azure-sql-database-ledger/images/banner.jpg"/></item><item><title>Intro to Data Lake Storage</title><link>https://www.kenmuse.com/blog/understanding-data-lake-storage/</link><pubDate>Wed, 06 Oct 2021 00:00:00 -0400</pubDate><guid isPermaLink="false">understanding-data-lake-storage</guid><category>Azure</category><category>Data</category><description>Learn how Azure Data Lake Storage Gen2 enhances the experience and creates a foundation for big data.</description><enclosure type="image/jpeg" url="https://www.kenmuse.com/blog/understanding-data-lake-storage/images/banner.jpg"/></item><item><title>Understanding Modern Data Warehouse Storage</title><link>https://www.kenmuse.com/blog/understanding-modern-data-warehouse-storage/</link><pubDate>Mon, 04 Oct 2021 00:00:00 -0400</pubDate><guid isPermaLink="false">understanding-modern-data-warehouse-storage</guid><category>Azure</category><category>Data</category><description>Learn the options and 
considerations for storing data in a modern data warehouse.</description><enclosure type="image/jpeg" url="https://www.kenmuse.com/blog/understanding-modern-data-warehouse-storage/images/banner.jpg"/></item><item><title>Modern Data Warehouse Ingestion</title><link>https://www.kenmuse.com/blog/modern-data-warehouse-ingestion/</link><pubDate>Mon, 06 Sep 2021 00:00:00 -0400</pubDate><guid isPermaLink="false">modern-data-warehouse-ingestion</guid><category>Azure</category><category>Data</category><description>&lt;p&gt;Data ingestion is the first step in the journey of creating a modern data warehouse. Essentially, this is the process of transporting data from one or more sources to storage, allowing it to be accessed and analyzed. The ingestion process is sometimes split into two aspects: the service responsible for receiving the content and the storage solution. We&amp;rsquo;ll explore both aspects in this series.&lt;/p&gt;
&lt;p&gt;Ingestion is the foundation of data processing and analytics. Although it is often the least discussed component, it is frequently the one most directly responsible for determining the performance and cost of the overall system. The reason is that dealing with big data is a trade-off between compute, storage, and latency. These trade-offs require careful balancing and selecting the tool or technology that is optimized for the problem being solved.&lt;/p&gt;</description><enclosure type="image/jpeg" url="https://www.kenmuse.com/blog/modern-data-warehouse-ingestion/images/banner.jpg"/></item><item><title>Introduction to the Modern Data Warehouse</title><link>https://www.kenmuse.com/blog/introduction-to-the-modern-data-warehouse/</link><pubDate>Mon, 09 Nov 2020 00:00:00 -0500</pubDate><guid isPermaLink="false">introduction-to-the-modern-data-warehouse</guid><category>Azure</category><category>Data</category><description>&lt;p&gt;In the past, the traditional data storage mechanisms were often cleanly divided between file storage, NoSQL and relational transactions, and data warehouses. The data warehouse was often a monolithic system, servicing the needs of both customers and internal stakeholders. With the explosion of data, the days of the single-system approaches have come to an end. For the modern data practitioner, it&amp;rsquo;s critical to consider the advantages of a cloud-hosted environment to dynamically support the growing data storage needs. As a result, you often find yourself having to rely on the strengths of multiple different components rather than any one single system. Over time, patterns have emerged which optimize this approach and ensure it remains manageable. The dominant approach is the &lt;em&gt;Modern Data Warehouse&lt;/em&gt; (MDW).&lt;/p&gt;</description><enclosure type="image/jpeg" url="https://www.kenmuse.com/blog/introduction-to-the-modern-data-warehouse/images/banner.jpg"/></item></channel></rss>