<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>R Statistical Language Archives - Analytica Data Science Solutions</title>
	<atom:link href="https://analyticadss.com/category/programming/r-statistical-language/feed/" rel="self" type="application/rss+xml" />
	<link>https://analyticadss.com/category/programming/r-statistical-language/</link>
	<description>World&#039;s Leading Artificial Intelligence Company</description>
	<lastBuildDate>Sat, 26 Aug 2023 09:33:24 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=7.0</generator>

<image>
	<url>https://analyticadss.com/wp-content/uploads/2020/06/cropped-F.B-Cover-photo_V0.1-02-32x32.png</url>
	<title>R Statistical Language Archives - Analytica Data Science Solutions</title>
	<link>https://analyticadss.com/category/programming/r-statistical-language/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Unleash the Power of Functional Programming in R with the purrr Package</title>
		<link>https://analyticadss.com/unleash-the-power-of-functional-programming-in-r-with-the-purrr-package/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Fri, 14 Apr 2023 18:10:01 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[functional programming]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Rstats]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=6138</guid>

					<description><![CDATA[<p>Introduction Welcome to our comprehensive guide on harnessing the power of the purrr package in R for functional programming. If you’re keen on elevating your R skills, you’re in for a treat. Today, we’ll be delving into the wonders of the purrr package — a lifesaver for functional programming. With the avalanche of data we encounter nowadays, having the [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/unleash-the-power-of-functional-programming-in-r-with-the-purrr-package/">Unleash the Power of Functional Programming in R with the purrr Package</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading has-medium-font-size" id="9372">Introduction</h2>



<p class="wp-block-paragraph" id="dd36">Welcome to our comprehensive guide on harnessing the power of the <code>purrr</code> package in R for functional programming. If you’re keen on elevating your R skills, you’re in for a treat. Today, we’ll be delving into the <code>purrr</code> package, a lifesaver for functional programming in R. With the avalanche of data we encounter nowadays, having the right tools for efficient data wrangling is paramount. If you’ve dabbled in R, you might have found certain built-in functions lacking, especially when grappling with intricate operations.</p>



<p class="wp-block-paragraph" id="a3fa">This is where <code>purrr</code> strides in, offering a plethora of robust tools to fine-tune your code, making it not only clearer but also more maintainable.</p>



<p class="wp-block-paragraph" id="e548">Throughout this article, we’ll journey through the intricacies of the <code>purrr</code> package, elucidate its fundamental functions, and showcase its real-world applicability. We’ll also touch on how it can enrich your experience with R, making it more fruitful. By the time you reach the end, you’ll be well-versed in the magic of <code>purrr</code> and ready to wield its power in your data endeavors. Let’s embark on this insightful voyage into the realm of R and unravel the capabilities of the <code>purrr</code> package.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h2 class="wp-block-heading has-medium-font-size" id="e803"><strong>What is functional programming and why is it useful?</strong></h2>



<p class="wp-block-paragraph" id="fd43">Functional programming isn’t merely a way to write code; it’s a philosophical shift that guides how we approach computation. By treating computation as the evaluation of mathematical functions, it forgoes changes to state and avoids mutable data. Instead, it thrives on pure functions: functions that always return the same output for the same inputs and have no side effects. The outcome? Code that’s more modular, predictable, and test-friendly.</p>
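


<p class="wp-block-paragraph">To make the idea of a pure function concrete, here is a minimal sketch in base R. The names <code>counter</code>, <code>add_to_counter()</code>, and <code>add()</code> are made up for illustration and are not part of any package.</p>



<pre class="wp-block-code"><code># Impure: the result depends on, and modifies, a variable outside the function
counter &lt;- 0
add_to_counter &lt;- function(x) {
  counter &lt;&lt;- counter + x  # side effect: changes state outside the function
  counter
}
add_to_counter(5)  # returns 5
add_to_counter(5)  # returns 10: same input, different output

# Pure: the result depends only on the arguments, and nothing else is changed
add &lt;- function(total, x) total + x
add(0, 5)  # returns 5
add(0, 5)  # returns 5 again
</code></pre>



<p class="wp-block-paragraph">Because <code>add()</code> touches nothing outside its arguments, it can be tested in isolation and reused safely, which is the property the rest of this article leans on.</p>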



<p class="wp-block-paragraph" id="0e06">Now, if you’re working with R, particularly for data manipulation and analysis, functional programming can be a game-changer. It lets you create more coherent and succinct code, and here’s how:</p>



<ol class="wp-block-list">
<li>Enhanced Readability and Maintainability: Decomposing complex procedures into smaller, more digestible functions makes your code easier to understand. Plus, it’s easier to tweak as needed.</li>



<li>Boosted Productivity: By steering clear of traps like global variables, which may lead to unforeseen behaviors and debugging headaches, functional programming saves time and frustration.</li>



<li>Potential Performance Gains: A functional style nudges you toward vectorized operations and away from hand-written loops, which can make your code more efficient. Keep in mind that iterating with <code>purrr</code> is not inherently faster than a well-written loop; the main gains are in clarity.</li>
</ol>
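


<p class="wp-block-paragraph">As a rough illustration of the last point, the sketch below does the same conversion three ways: with an explicit loop, with <code>map_dbl()</code>, and with plain vectorized arithmetic. The data and the helper <code>to_fahrenheit()</code> are invented for this example.</p>



<pre class="wp-block-code"><code>library(purrr)

temps_c &lt;- c(12.5, 18, 21.3)

# Hypothetical helper used by all three versions
to_fahrenheit &lt;- function(celsius) celsius * 9 / 5 + 32

# 1. Explicit loop: pre-allocate the result, then fill it in by index
temps_f_loop &lt;- numeric(length(temps_c))
for (i in seq_along(temps_c)) {
  temps_f_loop[i] &lt;- to_fahrenheit(temps_c[i])
}

# 2. Functional style: no index bookkeeping, and the output type is explicit
temps_f_map &lt;- map_dbl(temps_c, to_fahrenheit)

# 3. Vectorized: when the function already works on whole vectors,
#    calling it directly is usually the simplest option
temps_f_vec &lt;- to_fahrenheit(temps_c)

all.equal(temps_f_loop, temps_f_map)
all.equal(temps_f_map, temps_f_vec)
</code></pre>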



<p class="wp-block-paragraph" id="b77b">Eager to tap into these benefits? Read on! We’ll dive into the <code>purrr</code> package, an invaluable asset for adopting functional programming in R. Through its power, you can not only elevate your data manipulation and analysis routines but also bring more enjoyment and effectiveness to your programming journey.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h2 class="wp-block-heading has-medium-font-size" id="3fdc">Exploring the <code>purrr</code> package</h2>



<p class="wp-block-paragraph" id="283b">Belonging to the tidyverse collection, the <code>purrr</code> package serves as R’s gateway to functional programming. It’s packed with functions crafted to ease tasks when working with lists and a variety of data structures. Adopting <code>purrr</code> ensures that your data transformation, summarization, and manipulation processes benefit from a unified and logical syntax.</p>



<p class="wp-block-paragraph" id="382c">Let’s delve into what sets <code>purrr</code> apart:</p>



<ol class="wp-block-list">
<li>Uniformity in Function Naming: One of <code>purrr</code>’s strengths is its consistent naming scheme, which makes its functions easier to recall and use.</li>



<li>Proficiency with Complex Data Structures: Be it nested lists, data frames, or any layered data structure, <code>purrr</code> stands out in its ability to handle them.</li>



<li>Robust Error Management: Real-world data can be messy. <code>purrr</code> lends a hand by equipping you with functions that elegantly tackle errors and unexpected scenarios.</li>



<li>Harmony with <code>tidyverse</code> Companions: A significant advantage is <code>purrr</code>’s compatibility with renowned <code>tidyverse</code> allies such as <code>dplyr</code>, <code>tidyr</code>, and <code>ggplot2</code>. This cohesion allows for a smoother integration of functional programming into your existing data routines.</li>
</ol>



<p class="wp-block-paragraph" id="298a">Keen to commence your <code>purrr</code> journey? Simply install it from CRAN and load it in your R session.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.708335876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="install.packages(&quot;purrr&quot;)
library(purrr)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #66D9EF">install.packages</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;purrr&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(purrr)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="51fd">In the next section, we will dive into the core functions provided by the <code>purrr</code> package and demonstrate their usage with practical examples.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h2 class="wp-block-heading has-medium-font-size" id="b5b5"><strong>Core functions in purrr</strong></h2>



<p class="wp-block-paragraph" id="3c30">In this section, we will cover some of the most important and widely used functions in the <code>purrr</code> package, along with examples to demonstrate their usage.</p>



<p class="wp-block-paragraph" id="ac1b"><strong>A. The map() family</strong></p>



<p class="wp-block-paragraph" id="0f09">The <code>map()</code> family of functions is the heart of the <code>purrr</code> package. These functions allow you to apply a function to each element of a list or a vector and return the results in a specified format.</p>



<ul class="wp-block-list">
<li><code>map()</code>: Returns a list.</li>



<li><code>map_lgl()</code>: Returns a logical vector.</li>



<li><code>map_int()</code>: Returns an integer vector.</li>



<li><code>map_dbl()</code>: Returns a double vector.</li>



<li><code>map_chr()</code>: Returns a character vector.</li>



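<li><code>map_df()</code>: Returns a data frame by row-binding the results. It needs the <code>dplyr</code> package and is superseded as of <code>purrr</code> 1.0.0 in favor of <code>map()</code> followed by <code>list_rbind()</code>.</li>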
</ul>
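


<p class="wp-block-paragraph">The typed variants are easiest to appreciate side by side. The following is a small sketch on a made-up list called <code>values</code>; each call returns an atomic vector of the type named in the function’s suffix and raises an error if a result does not fit that type.</p>



<pre class="wp-block-code"><code>library(purrr)

# Illustrative input, not from the article's data
values &lt;- list(1.5, 2.5, 4)

doubled  &lt;- map_dbl(values, ~ .x * 2)           # double vector: 3 5 8
labels   &lt;- map_chr(values, ~ paste0("v", .x))  # character vector: "v1.5" "v2.5" "v4"
is_large &lt;- map_lgl(values, ~ .x &gt; 2)           # logical vector: FALSE TRUE TRUE
lengths  &lt;- map_int(values, ~ length(.x))       # integer vector: 1 1 1
</code></pre>



<p class="wp-block-paragraph">Choosing the typed variant up front means a surprise in the data shows up as an immediate error rather than as a list you have to untangle later.</p>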



<p class="wp-block-paragraph" id="e748">Example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.7083282470703125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Define a list of numbers
number_list <- list(1, 2, 3, 4)

# Square each number using map()
squared_numbers <- map(number_list, ~ .x^2)
print(squared_numbers)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Define a list of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">number_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Square each number using map()</span></span>
<span class="line"><span style="color: #F8F8F2">squared_numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> map(number_list, </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> .x</span><span style="color: #F92672">^</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(squared_numbers)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="89fb"><strong>B. pmap()</strong></p>



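<p class="wp-block-paragraph" id="1c29">The <code>pmap()</code> function applies a function to several lists or vectors in parallel. You pass them together as a single list, and on each step the function receives the corresponding element from each one. Inside a formula, <code>..1</code> and <code>..2</code> refer to the first and second inputs.</p>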



<p class="wp-block-paragraph" id="f317">Example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.7083282470703125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Define two lists
list1 <- list(1, 2, 3)
list2 <- list(4, 5, 6)

# Add corresponding elements of the two lists using pmap()
sum_list <- pmap(list(list1, list2), ~ ..1 + ..2)
print(sum_list)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Define two lists</span></span>
<span class="line"><span style="color: #F8F8F2">list1 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">list2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">6</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Add corresponding elements of the two lists using pmap()</span></span>
<span class="line"><span style="color: #F8F8F2">sum_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> pmap(</span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(list1, list2), </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> ..1 </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> .</span><span style="color: #AE81FF">.2</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(sum_list)</span></span></code></pre></div>
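


<p class="wp-block-paragraph">Since a data frame is itself a list of columns, <code>pmap()</code> can also be used to walk over rows, matching column names to argument names. The sketch below assumes a made-up data frame <code>params</code> and a hypothetical helper <code>power()</code>; <code>pmap_dbl()</code> is the typed variant that returns a double vector.</p>



<pre class="wp-block-code"><code>library(purrr)

# Illustrative parameters: one row per call
params &lt;- data.frame(
  base = c(2, 3, 4),
  exponent = c(3, 2, 1)
)

# Hypothetical helper whose argument names match the column names
power &lt;- function(base, exponent) base^exponent

# Each row is passed to power() as named arguments
powers &lt;- pmap_dbl(params, power)
print(powers)  # 8 9 4
</code></pre>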



<p class="wp-block-paragraph" id="0ca9"><strong>C. safely(), quietly(), and possibly()</strong></p>



<p class="wp-block-paragraph" id="ee39">These functions wrap another function and return a modified version of it that handles errors, warnings, and messages gracefully instead of stopping your code.</p>



<ul class="wp-block-list">
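<li><code>safely()</code>: The wrapped function returns a list with two elements, <code>result</code> and <code>error</code>. With the default settings, one of them is always <code>NULL</code>.</li>



<li><code>quietly()</code>: The wrapped function returns a list containing the result, any printed output, any warnings, and any messages.</li>



<li><code>possibly()</code>: The wrapped function returns a default value of your choosing if an error is encountered.</li>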
</ul>



<p class="wp-block-paragraph" id="a799">Example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.708335876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Define a list with numbers and a character
mixed_list <- list(1, 2, &quot;a&quot;, 3)

# Define a safely wrapped square function
safe_square <- safely(~ .x^2)

# Apply the safe_square function to the mixed_list
results <- map(mixed_list, safe_square)
print(results)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Define a list with numbers and a character</span></span>
<span class="line"><span style="color: #F8F8F2">mixed_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;a&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define a safely wrapped square function</span></span>
<span class="line"><span style="color: #F8F8F2">safe_square </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> safely(</span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> .x</span><span style="color: #F92672">^</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Apply the safe_square function to the mixed_list</span></span>
<span class="line"><span style="color: #F8F8F2">results </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> map(mixed_list, safe_square)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(results)</span></span></code></pre></div>
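


<p class="wp-block-paragraph">The example above covers <code>safely()</code>. For completeness, here is a brief sketch of the other two wrappers. It reuses the same kind of mixed list; the names <code>possibly_square</code> and <code>quiet_log</code> are chosen only for this illustration.</p>



<pre class="wp-block-code"><code>library(purrr)

mixed_list &lt;- list(1, 2, "a", 3)

# possibly(): replace any error with a default value (here NA)
possibly_square &lt;- possibly(~ .x^2, otherwise = NA_real_)
squares &lt;- map_dbl(mixed_list, possibly_square)
print(squares)  # 1 4 NA 9

# quietly(): capture warnings and messages instead of printing them
quiet_log &lt;- quietly(log)
out &lt;- quiet_log(-1)
out$result    # NaN
out$warnings  # "NaNs produced"
</code></pre>



<p class="wp-block-paragraph">As a rule of thumb, <code>possibly()</code> fits when a default is good enough, while <code>safely()</code> fits when you want to inspect the errors afterwards.</p>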



<p class="wp-block-paragraph" id="cbfb"><strong>D. compact() and compose()</strong></p>



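<p class="wp-block-paragraph" id="37ee"><code>compact()</code> is used to remove <code>NULL</code> and other empty elements from a list, while <code>compose()</code> allows you to combine multiple functions into a single function. By default, <code>compose()</code> applies the functions from right to left, so <code>compose(increment, square)</code> squares first and then increments.</p>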



<p class="wp-block-paragraph" id="a367">Example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.402778625488281px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Define a list with NULL elements
null_list <- list(1, NULL, 2, NULL, 3)

# Remove NULL elements using compact()
clean_list <- compact(null_list)
print(clean_list)

# Compose two functions: square and increment
square <- function(x) x^2
increment <- function(x) x + 1
square_and_increment <- compose(increment, square)

# Apply the composed function to a number
result <- square_and_increment(3)
print(result)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Define a list with NULL elements</span></span>
<span class="line"><span style="color: #F8F8F2">null_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">NULL</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">NULL</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Remove NULL elements using compact()</span></span>
<span class="line"><span style="color: #F8F8F2">clean_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> compact(null_list)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(clean_list)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Compose two functions: square and increment</span></span>
<span class="line"><span style="color: #A6E22E">square</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(x) x</span><span style="color: #F92672">^</span><span style="color: #AE81FF">2</span></span>
<span class="line"><span style="color: #A6E22E">increment</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(x) x </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #F8F8F2">square_and_increment </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> compose(increment, square)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Apply the composed function to a number</span></span>
<span class="line"><span style="color: #F8F8F2">result </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> square_and_increment(</span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(result)</span></span></code></pre></div>
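


<p class="wp-block-paragraph">The order in which <code>compose()</code> applies its functions is easy to get backwards, so here is a short sketch that shows both directions using the <code>.dir</code> argument. The two helpers are redefined so the snippet runs on its own.</p>



<pre class="wp-block-code"><code>library(purrr)

square &lt;- function(x) x^2
increment &lt;- function(x) x + 1

# Default (.dir = "backward"): the right-most function runs first
square_then_increment &lt;- compose(increment, square)
square_then_increment(3)  # (3^2) + 1 = 10

# .dir = "forward": the left-most function runs first
increment_then_square &lt;- compose(increment, square, .dir = "forward")
increment_then_square(3)  # (3 + 1)^2 = 16
</code></pre>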



<p class="wp-block-paragraph">These core functions are just the beginning of what <code>purrr</code> has to offer. In the next section, we will demonstrate how to use these functions to solve real-world problems through practical examples.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h2 class="wp-block-heading has-medium-font-size" id="4243">Practical examples with purrr</h2>



<p class="wp-block-paragraph" id="1de5">In this section, we will explore two practical examples that demonstrate how the <code>purrr</code> package can be used to solve real-world problems efficiently.</p>



<p class="wp-block-paragraph" id="3fb2"><strong>A. Example 1: Calculating summary statistics for multiple variables</strong></p>



<p class="wp-block-paragraph" id="3260">Suppose you have a data frame with multiple numerical variables, and you want to calculate summary statistics (mean, median, and standard deviation) for each of these variables.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.40277099609375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load required packages
library(dplyr)
library(purrr)

# Create a sample data frame
data <- data.frame(
  var1 = rnorm(100, mean = 10, sd = 2),
  var2 = rnorm(100, mean = 20, sd = 5),
  var3 = rnorm(100, mean = 30, sd = 3),
  stringsAsFactors = FALSE
)

# Define a list of summary functions
summary_functions <- list(mean = mean, median = median, sd = sd)

# Calculate summary statistics for each variable using nested map functions
summary_stats <- map_dfr(summary_functions, ~ map_dfc(data, .x), .id = &quot;Statistic&quot;)
print(summary_stats)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Load required packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(dplyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(purrr)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a sample data frame</span></span>
<span class="line"><span style="color: #F8F8F2">data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">data.frame</span><span style="color: #F8F8F2">(</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #FD971F; font-style: italic">var1</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">rnorm</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">mean</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">10</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">sd</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">),</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #FD971F; font-style: italic">var2</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">rnorm</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">mean</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">20</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">sd</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">),</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #FD971F; font-style: italic">var3</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">rnorm</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">mean</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">30</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">sd</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">),</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #FD971F; font-style: italic">stringsAsFactors</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">FALSE</span></span>
<span class="line"><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define a list of summary functions</span></span>
<span class="line"><span style="color: #F8F8F2">summary_functions </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF; font-style: italic">list</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F; font-style: italic">mean</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> mean, </span><span style="color: #FD971F; font-style: italic">median</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> median, </span><span style="color: #FD971F; font-style: italic">sd</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> sd)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Calculate summary statistics for each variable using nested map functions</span></span>
<span class="line"><span style="color: #F8F8F2">summary_stats </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> map_dfr(summary_functions, </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> map_dfc(data, .x), </span><span style="color: #FD971F; font-style: italic">.id</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Statistic&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(summary_stats)</span></span></code></pre></div>
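<p class="wp-block-paragraph">As of purrr 1.0.0, <code>map_dfr()</code> and <code>map_dfc()</code> are superseded in favor of <code>map()</code> combined with <code>list_rbind()</code> or <code>list_cbind()</code>. The following is a minimal sketch of the same summary table in that style, assuming purrr 1.0.0 or later; the two-column sample data and the fixed seed are illustrative assumptions, not part of the original example.</p>

<pre class="wp-block-code"><code># Assumes purrr &gt;= 1.0.0, where list_rbind() is available
library(purrr)

# Illustrative sample data (hypothetical values, fixed seed for repeatability)
set.seed(42)
data &lt;- data.frame(
  var1 = rnorm(100, mean = 10, sd = 2),
  var2 = rnorm(100, mean = 20, sd = 5)
)

summary_functions &lt;- list(mean = mean, median = median, sd = sd)

# For each summary function, compute one value per column as a one-row data frame
summary_rows &lt;- map(summary_functions, function(f) as.data.frame(map(data, f)))

# Stack the rows; the list names (mean, median, sd) become the Statistic column
summary_stats &lt;- list_rbind(summary_rows, names_to = "Statistic")
print(summary_stats)</code></pre>

<p class="wp-block-paragraph">The result has one row per statistic and one column per variable, which matches the layout produced by the nested <code>map_dfr()</code> call.</p>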



<p class="wp-block-paragraph" id="ae64"><strong>B. Example 2: Fitting multiple linear models for different subsets of data</strong></p>



<p class="wp-block-paragraph" id="c80a">In this example, we will fit linear models for different subsets of the <code>mtcars</code> dataset based on the number of cylinders. We will use <code>purrr</code> functions to apply the linear model function to each subset and extract the model coefficients.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.402786254882812px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load required packages
library(dplyr)
library(purrr)
library(broom)

# Split the mtcars dataset by the number of cylinders
mtcars_split <- mtcars %>% group_split(cyl)

# Define a function to fit a linear model and extract coefficients
fit_lm <- function(data) {
  model <- lm(mpg ~ wt, data = data)
  coef <- data.frame(tidy(model)) %>%
    select(term, estimate) %>%
    mutate(cyl = unique(data$cyl))
  return(coef)
}

# Apply the fit_lm function to each subset using map_dfr()
model_coefs <- map_dfr(mtcars_split, fit_lm)
print(model_coefs)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Load required packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(dplyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(purrr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(broom)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Split the mtcars dataset by the number of cylinders</span></span>
<span class="line"><span style="color: #F8F8F2">mtcars_split </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mtcars </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> group_split(cyl)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define a function to fit a linear model and extract coefficients</span></span>
<span class="line"><span style="color: #A6E22E">fit_lm</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(data) {</span></span>
<span class="line"><span style="color: #F8F8F2">  model </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">lm</span><span style="color: #F8F8F2">(mpg </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> wt, </span><span style="color: #FD971F; font-style: italic">data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> data)</span></span>
<span class="line"><span style="color: #F8F8F2">  coef </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">data.frame</span><span style="color: #F8F8F2">(tidy(model)) </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">    select(term, estimate) </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">    mutate(</span><span style="color: #FD971F; font-style: italic">cyl</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">unique</span><span style="color: #F8F8F2">(data</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">cyl))</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">return</span><span style="color: #F8F8F2">(coef)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Apply the fit_lm function to each subset using map_dfr()</span></span>
<span class="line"><span style="color: #F8F8F2">model_coefs </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> map_dfr(mtcars_split, fit_lm)</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(model_coefs)</span></span></code></pre></div>
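<p class="wp-block-paragraph">When only a single coefficient per subset is needed, a lighter variant without <code>broom</code> can be sketched with base R's <code>split()</code> and <code>coef()</code>. The names <code>models</code> and <code>slopes</code> are illustrative and not part of the original example.</p>

<pre class="wp-block-code"><code>library(purrr)

# Fit one model per cylinder group; split() names the list "4", "6", "8"
models &lt;- map(split(mtcars, mtcars$cyl), function(df) lm(mpg ~ wt, data = df))

# Pull the wt slope from each model into a named numeric vector
slopes &lt;- map_dbl(models, function(m) coef(m)[["wt"]])
print(slopes)</code></pre>

<p class="wp-block-paragraph">Keeping the fitted models in a list also makes it easy to reuse them later, for example to extract residuals or predictions with another <code>map()</code> call.</p>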



<p class="wp-block-paragraph" id="8a7a"><strong>C. Reading Multiple CSV files with purrr</strong></p>



<p class="wp-block-paragraph" id="1f79">Suppose you have multiple CSV files in a directory and you want to read them all into a single data frame using <code>purrr</code>. Here’s an example of how you can achieve this:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.402801513671875px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load required packages
library(readr)
library(dplyr)
library(purrr)

# Define the directory containing the CSV files
csv_directory <- &quot;path/to/your/csv/files&quot;

# List all CSV files in the directory (pattern is a regular expression, not a glob)
csv_files <- list.files(csv_directory, pattern = &quot;\\.csv$&quot;, full.names = TRUE)

# Define a function to read a CSV file and add a column with the filename
read_csv_with_filename <- function(file) {
  data <- read_csv(file)
  data <- data %>% mutate(filename = basename(file))
  return(data)
}

# Read all CSV files using map_dfr() and bind the results into a single data frame
combined_data <- map_dfr(csv_files, read_csv_with_filename)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Load required packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(readr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(dplyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(purrr)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define the directory containing the CSV files</span></span>
<span class="line"><span style="color: #F8F8F2">csv_directory </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;path/to/your/csv/files&quot;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># List all CSV files in the directory (pattern is a regular expression, not a glob)</span></span>
<span class="line"><span style="color: #F8F8F2">csv_files </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list.files</span><span style="color: #F8F8F2">(csv_directory, </span><span style="color: #FD971F; font-style: italic">pattern</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;\\.csv$&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">full.names</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define a function to read a CSV file and add a column with the filename</span></span>
<span class="line"><span style="color: #A6E22E">read_csv_with_filename</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(file) {</span></span>
<span class="line"><span style="color: #F8F8F2">  data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> read_csv(file)</span></span>
<span class="line"><span style="color: #F8F8F2">  data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> data </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> mutate(</span><span style="color: #FD971F; font-style: italic">filename</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">basename</span><span style="color: #F8F8F2">(file))</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">return</span><span style="color: #F8F8F2">(data)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Read all CSV files using map_dfr() and bind the results into a single data frame</span></span>
<span class="line"><span style="color: #F8F8F2">combined_data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> map_dfr(csv_files, read_csv_with_filename)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="9b39">In this example, we first list all the CSV files in the specified directory. Then, we define a custom function <code>read_csv_with_filename()</code> that reads each CSV file with <code>read_csv()</code> (from the <code>readr</code> package) and adds a column with the filename. Finally, we use <code>purrr</code>’s <code>map_dfr()</code> function to apply the custom function to each file in the list and bind the results into a single data frame.</p>
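<p class="wp-block-paragraph">To try this pattern without real files, the sketch below writes two small CSV files to a temporary directory and reads them back. It is only an illustration under a few assumptions: the sample data and the <code>purrr_csv_demo</code> folder name are hypothetical, base R's <code>read.csv()</code> stands in for <code>read_csv()</code> to keep dependencies minimal, and <code>list_rbind()</code> requires purrr 1.0.0 or later.</p>

<pre class="wp-block-code"><code># Assumes purrr &gt;= 1.0.0, where list_rbind() is available
library(purrr)

# Write two small CSV files to a temporary directory (hypothetical sample data)
csv_directory &lt;- file.path(tempdir(), "purrr_csv_demo")
dir.create(csv_directory, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(10, 20)),
          file.path(csv_directory, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, value = c(30, 40)),
          file.path(csv_directory, "b.csv"), row.names = FALSE)

# "\\.csv$" is a regular expression matching file names that end in ".csv"
csv_files &lt;- list.files(csv_directory, pattern = "\\.csv$", full.names = TRUE)

# Name each path by its file name so the source can be recorded when binding
named_files &lt;- set_names(csv_files, basename)

# Read every file, then stack the results; list names fill the filename column
combined_data &lt;- list_rbind(map(named_files, read.csv), names_to = "filename")
print(combined_data)</code></pre>

<p class="wp-block-paragraph">Naming the list elements up front removes the need for a wrapper function, because <code>list_rbind()</code> records each element's name in the column given by <code>names_to</code>.</p>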



<p class="wp-block-paragraph" id="bd14"><strong>D. purrr and ggplot2</strong></p>



<p class="wp-block-paragraph" id="78fa">In this example, we’ll demonstrate how to use <code>purrr</code> to create multiple ggplots for different subsets of data within a single data frame. We’ll use the <code>mtcars</code> dataset and create separate ggplots for each unique number of cylinders.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" data-code-block-pro-font-family="Code-Pro-JetBrains-Mono" style="font-size:.875rem;font-family:Code-Pro-JetBrains-Mono,ui-monospace,SFMono-Regular,Menlo,Monaco,Consolas,monospace;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.40277099609375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load required packages
library(purrr)
library(ggplot2)
library(dplyr)
library(cowplot)

# Create a list of data frames, one for each unique number of cylinders in the mtcars dataset
data_list <- mtcars %>%
  split(.$cyl)

# Define a function to create a ggplot for a given data frame
create_ggplot <- function(data) {
  ggplot(data, aes(x = mpg, y = hp)) +
    geom_point(aes(color = factor(gear)), size = 3) +
    labs(title = paste(&quot;Number of Cylinders:&quot;, unique(data$cyl)),
         x = &quot;Miles per Gallon&quot;,
         y = &quot;Horsepower&quot;) +
    theme_minimal() +
    theme(legend.title = element_blank()) +
    scale_color_discrete(name = &quot;Gears&quot;)
}

# Create a list of ggplots using map()
ggplot_list <- data_list %>% 
  map(create_ggplot)

# Combine the ggplots into a single plot using cowplot's plot_grid()
combined_plot <- plot_grid(plotlist = ggplot_list, ncol = 1, align = &quot;v&quot;, rel_heights = c(1, 1, 1))

# Display the combined plot
print(combined_plot)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822" tabindex="0"><code><span class="line"><span style="color: #88846F"># Load required packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(purrr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(ggplot2)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(dplyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(cowplot)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a list of data frames, one for each unique number of cylinders in the mtcars dataset</span></span>
<span class="line"><span style="color: #F8F8F2">data_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mtcars </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">split</span><span style="color: #F8F8F2">(.</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">cyl)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Define a function to create a ggplot for a given data frame</span></span>
<span class="line"><span style="color: #A6E22E">create_ggplot</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(data) {</span></span>
<span class="line"><span style="color: #F8F8F2">  ggplot(data, aes(</span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> mpg, </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> hp)) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">    geom_point(aes(</span><span style="color: #FD971F; font-style: italic">color</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">factor</span><span style="color: #F8F8F2">(gear)), </span><span style="color: #FD971F; font-style: italic">size</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">    labs(</span><span style="color: #FD971F; font-style: italic">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">paste</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;Number of Cylinders:&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">unique</span><span style="color: #F8F8F2">(data</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">cyl)),</span></span>
<span class="line"><span style="color: #F8F8F2">         </span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Miles per Gallon&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">         </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Horsepower&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">    theme_minimal() </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">    theme(</span><span style="color: #FD971F; font-style: italic">legend.title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> element_blank()) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">    scale_color_discrete(</span><span style="color: #FD971F; font-style: italic">name</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Gears&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a list of ggplots using map()</span></span>
<span class="line"><span style="color: #F8F8F2">ggplot_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> data_list </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> </span></span>
<span class="line"><span style="color: #F8F8F2">  map(create_ggplot)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Combine the ggplots into a single plot using cowplot&#39;s plot_grid()</span></span>
<span class="line"><span style="color: #F8F8F2">combined_plot </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> plot_grid(</span><span style="color: #FD971F; font-style: italic">plotlist</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> ggplot_list, </span><span style="color: #FD971F; font-style: italic">ncol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">align</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;v&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">rel_heights</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Display the combined plot</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(combined_plot)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="674e">In this example, we first create a list of data frames, one for each unique number of cylinders in the <code>mtcars</code> dataset. Then, we define a custom function <code>create_ggplot()</code> to create a ggplot for a given data frame. The function creates a scatterplot of miles per gallon (mpg) versus horsepower (hp), with a title that reflects the number of cylinders.</p>



<p class="wp-block-paragraph" id="1d4e">Finally, we use <code>purrr</code>’s <code>map()</code> function to apply the custom function to each data frame in the list, resulting in a list of ggplots, which we then combine and display with <code>plot_grid()</code>.</p>



<p class="wp-block-paragraph" id="4483">The plot we get can be seen below:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="720" height="663" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2023/04/1_0zvCrChg56Sq3kBk3neGSg.webp" alt="" class="wp-image-6139" srcset="https://analyticadss.com/wp-content/uploads/2023/04/1_0zvCrChg56Sq3kBk3neGSg.webp 720w, https://analyticadss.com/wp-content/uploads/2023/04/1_0zvCrChg56Sq3kBk3neGSg-500x460.webp 500w, https://analyticadss.com/wp-content/uploads/2023/04/1_0zvCrChg56Sq3kBk3neGSg-150x138.webp 150w" sizes="auto, (max-width: 720px) 100vw, 720px" /></figure>
</div>


<p class="wp-block-paragraph" id="aa70">The <code>create_ggplot()</code> function also includes a few choices that improve the look of the plots:</p>



<ol class="wp-block-list">
<li>We use <code>geom_point(aes(color = factor(gear)), size = 3)</code> to color the points by the number of gears and increase their size.</li>



<li>We apply <code>theme_minimal()</code> to use a minimalistic theme for the plots.</li>



<li>We remove the legend title using <code>theme(legend.title = element_blank())</code>.</li>



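<li>We name the color scale “Gears” using <code>scale_color_discrete(name = "Gears")</code>. Note that this title is not drawn while <code>legend.title</code> is set to <code>element_blank()</code>, so drop that <code>theme()</code> call if you want the title shown.</li>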
</ol>



<p class="wp-block-paragraph" id="f475">Finally, we use the <code>plot_grid()</code> function from the <code>cowplot</code> package to combine the ggplots in the <code>ggplot_list</code> into a single plot with one column and display the combined plot.</p>
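<p class="wp-block-paragraph">If you would rather show the plots one at a time than in a grid, a possible variant is sketched below. It uses <code>imap()</code>, which passes each list element together with its name, and <code>walk()</code>, which calls a function purely for its side effect. The simplified plot (no color or theme settings) is an assumption made to keep the sketch short.</p>

<pre class="wp-block-code"><code>library(purrr)
library(ggplot2)

# split() names the list elements "4", "6" and "8"
data_list &lt;- split(mtcars, mtcars$cyl)

# imap() hands each data frame and its name (the cylinder count) to the function
ggplot_list &lt;- imap(data_list, function(data, cyl) {
  ggplot(data, aes(x = mpg, y = hp)) +
    geom_point() +
    labs(title = paste("Number of Cylinders:", cyl))
})

# walk() calls print() on each plot for its side effect, drawing them one by one
walk(ggplot_list, print)</code></pre>

<p class="wp-block-paragraph">Taking the title from the list name avoids recomputing <code>unique(data$cyl)</code> inside the plotting function.</p>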



<p class="wp-block-paragraph" id="2900">These examples showcase how the <code>purrr</code> package can help you write more efficient and readable code, making your data analysis workflows more robust and maintainable. By incorporating <code>purrr</code> into your R projects, you can take full advantage of functional programming techniques and harness their power to solve complex problems.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h2 class="wp-block-heading has-medium-font-size" id="5467">Tips and Best Practices for Using purrr</h2>



<p class="wp-block-paragraph" id="e68e">In this final section, we will share some tips and best practices for using the <code>purrr</code> package in your R projects. These recommendations will help you write more efficient, readable, and maintainable code.</p>



<p class="wp-block-paragraph" id="6bbd"><strong>1. Use anonymous functions when appropriate</strong></p>



<p class="wp-block-paragraph" id="5346">When using <code>map()</code> functions, you can create anonymous functions using the <code>~</code> notation, which allows for concise and readable code. However, if the function becomes too complex or is used multiple times, consider defining it as a separate named function for better code organization and readability.</p>
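<p class="wp-block-paragraph">As a small illustration of this trade-off, the sketch below computes the same result with an anonymous function and with a named one. The <code>values</code> list and the <code>value_range()</code> helper are hypothetical names chosen for the example.</p>

<pre class="wp-block-code"><code>library(purrr)

# Hypothetical input: two named numeric vectors
values &lt;- list(a = c(1, 5, 3), b = c(10, 4, 6))

# Short, one-off logic reads well as an anonymous function (formula shorthand)
ranges_inline &lt;- map_dbl(values, ~ max(.x) - min(.x))

# Logic that is longer or reused is clearer as a named function
value_range &lt;- function(x) {
  max(x) - min(x)
}
ranges_named &lt;- map_dbl(values, value_range)</code></pre>

<p class="wp-block-paragraph">Both calls return the same named numeric vector; the named version is easier to test and document on its own.</p>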



<p class="wp-block-paragraph" id="aafb"><strong>2. Leverage the power of function composition</strong></p>



<p class="wp-block-paragraph" id="d747">The <code>compose()</code> function allows you to create new functions by combining existing ones. This technique promotes code reusability and makes it easier to build complex functionality by breaking it down into simpler, more manageable parts.</p>
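<p class="wp-block-paragraph">A minimal sketch of composition follows. By default, <code>compose()</code> applies its functions from right to left, so <code>compose(round, mean)</code> computes the mean first and then rounds it. The <code>round_mean</code> name and the sample vectors are illustrative assumptions.</p>

<pre class="wp-block-code"><code>library(purrr)

# compose(round, mean) is equivalent to function(x) round(mean(x))
round_mean &lt;- compose(round, mean)

# Apply the composed function to each vector in a list
samples &lt;- list(c(1, 2, 4), c(10, 12, 15))
rounded_means &lt;- map_dbl(samples, round_mean)
print(rounded_means)</code></pre>

<p class="wp-block-paragraph">If left-to-right order reads more naturally, <code>compose()</code> also accepts <code>.dir = "forward"</code>.</p>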



<p class="wp-block-paragraph" id="76d8"><strong>3. Handle errors gracefully</strong></p>



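<p class="wp-block-paragraph" id="21ec">When applying a function to a list or vector, wrap it with <code>safely()</code> or <code>possibly()</code> so that an error in one element does not stop the whole computation: <code>safely()</code> returns both the result and the error, while <code>possibly()</code> substitutes a default value. Use <code>quietly()</code> to capture printed output, messages, and warnings instead of letting them reach the console. This approach keeps your code robust when it meets unexpected input values.</p>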
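<p class="wp-block-paragraph">The difference between the two error wrappers can be sketched as follows. The <code>inputs</code> list, with a string placed deliberately among the numbers, is a hypothetical example.</p>

<pre class="wp-block-code"><code>library(purrr)

# Hypothetical input: log("a") raises an error
inputs &lt;- list(1, "a", 10)

# possibly() returns a fallback value instead of stopping on an error
safe_log &lt;- possibly(log, otherwise = NA_real_)
logs &lt;- map_dbl(inputs, safe_log)

# safely() keeps both outcomes: each element has $result and $error
detailed &lt;- map(inputs, safely(log))
failed &lt;- map_lgl(detailed, function(res) !is.null(res$error))</code></pre>

<p class="wp-block-paragraph">Here <code>logs</code> contains <code>NA</code> for the failing element, while <code>failed</code> flags which inputs raised an error so they can be inspected afterwards.</p>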



<p class="wp-block-paragraph" id="97a5"><strong>4. Know when to use purrr vs. base R or dplyr</strong></p>



<p class="wp-block-paragraph" id="aac3">While <code>purrr</code> provides a powerful and flexible way to manipulate data, there are cases where base R or <code>dplyr</code> functions may be more appropriate or efficient. For example, if you need to perform simple operations on a data frame, consider using <code>dplyr</code> functions like <code>mutate()</code> or <code>summarize()</code>. Evaluate the needs of your specific task and choose the best tool for the job.</p>
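<p class="wp-block-paragraph">As a rough illustration of that choice, the sketch below computes the same column means in two ways. Which one fits better depends on whether you want a data frame or a plain vector back; the variable names are illustrative.</p>

<pre class="wp-block-code"><code>library(dplyr)
library(purrr)

# A column-wise summary that stays inside a data frame: dplyr is a natural fit
means_dplyr &lt;- mtcars %&gt;% summarize(across(c(mpg, hp), mean))

# The same computation as a named numeric vector: purrr works well here
means_purrr &lt;- map_dbl(mtcars[c("mpg", "hp")], mean)</code></pre>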



<p class="wp-block-paragraph" id="d5c9"><strong>5. Familiarize yourself with the purrr documentation</strong></p>



<p class="wp-block-paragraph" id="57fa">The <code>purrr</code> package has a wealth of functions and features that can help you streamline your code and solve complex problems. Make sure to consult the official documentation (<a href="https://purrr.tidyverse.org/" rel="noreferrer noopener" target="_blank">https://purrr.tidyverse.org/</a>) to explore its full capabilities and discover new techniques.</p>



<p class="wp-block-paragraph" id="eac3">By following these tips and best practices, you can fully leverage the power of the <code>purrr</code> package in your R projects, making your code more efficient, readable, and maintainable. Embrace the functional programming paradigm and use <code>purrr</code> to solve real-world data analysis challenges with ease.</p>



<hr class="wp-block-separator has-alpha-channel-opacity is-style-dots"/>



<h1 class="wp-block-heading" id="c363">Wrapping up</h1>



<p class="wp-block-paragraph" id="6914">Throughout this article, we’ve explored the capabilities and flexibility of R’s <code>purrr</code> package for functional programming and data handling, from foundational functional programming principles and the central role of the <code>map()</code> family of functions to more advanced topics such as working with nested data and handling errors.</p>



<p class="wp-block-paragraph" id="75dd">Using real-world scenarios, we’ve shown how <code>purrr</code> can simplify daunting tasks, streamline your scripts, and make them more readable and maintainable. Incorporating <code>purrr</code> into your R toolkit makes data manipulation and analysis smoother.</p>



<p class="wp-block-paragraph" id="78e2">As you explore the <code>purrr</code> package further, keep in mind that mastery comes with practice. Experiment, and look for opportunities to apply <code>purrr</code> functions in your own projects. With persistence, you’ll develop a solid grasp of its details and become more proficient at data manipulation in R.</p>



<p class="wp-block-paragraph" id="15ba">Happy coding!</p>



<p class="wp-block-paragraph" id="b396"><strong>Further Reading and Exploration:</strong></p>



<p class="wp-block-paragraph" id="79d8">For those eager to learn more about <code>purrr</code> and functional programming in R, consider the following resources:</p>



<ol class="wp-block-list">
<li>The Official <code>purrr</code> Guide: As a logical first step, the <code>purrr</code> package’s official documentation provides a thorough overview of everything it offers. Explore the details at the <a href="https://purrr.tidyverse.org/" rel="noreferrer noopener" target="_blank">official <code>purrr</code> site</a>.</li>



<li>R for Data Science: Written by Hadley Wickham and Garrett Grolemund, this freely available online book offers a thorough look at R’s role in data science. Notably, it includes material on functional programming with <code>purrr</code>. Read it <a href="https://r4ds.had.co.nz/" rel="noreferrer noopener" target="_blank">here</a>.</li>



<li>Advanced R: A deeper dive by Hadley Wickham, “Advanced R” ventures into the more intricate aspects of R, shedding light on advanced functional programming paradigms. Embark on this advanced journey <a href="https://adv-r.hadley.nz/" rel="noreferrer noopener" target="_blank">here</a>.</li>



<li>RStudio’s Vibrant Community: Seeking advice, hoping to discuss new findings, or simply aiming to network? The RStudio community is a hub of enthusiasts, experts, and curious minds. Engage with like-minded individuals <a href="https://community.rstudio.com/" rel="noreferrer noopener" target="_blank">right here</a>.</li>
</ol>



<p class="wp-block-paragraph" id="238c">Using these resources and engaging with the wider R community will sharpen your skills with both the <code>purrr</code> package and functional programming in R. Keep exploring, experimenting, and learning with others to grow as a data scientist and R user.</p>
<p>The post <a href="https://analyticadss.com/unleash-the-power-of-functional-programming-in-r-with-the-purrr-package/">Unleash the Power of Functional Programming in R with the purrr Package</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Introduction to Probability and Statistics: Basic Concepts and Terminology with Visuals — Part I</title>
		<link>https://analyticadss.com/introduction-to-probability-and-statistics-basic-concepts-and-terminology-with-visuals-part-i/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Wed, 22 Mar 2023 21:20:10 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[analytica data science solution]]></category>
		<category><![CDATA[analyticadss]]></category>
		<category><![CDATA[R]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=6126</guid>

					<description><![CDATA[<p>Welcome to the first part of our series, “Demystifying Data Science: A Comprehensive Guide for Beginners.” This series is designed to help aspiring data scientists gain a solid understanding of the fundamental concepts and techniques in the field of data science. We will explore various topics, including probability, statistics, machine learning, and data visualization, with [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/introduction-to-probability-and-statistics-basic-concepts-and-terminology-with-visuals-part-i/">Introduction to Probability and Statistics: Basic Concepts and Terminology with Visuals — Part I</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph" id="01da">Welcome to the first part of our series, “Demystifying Data Science: A Comprehensive Guide for Beginners.” This series is designed to help aspiring data scientists gain a solid understanding of the fundamental concepts and techniques in the field of data science. We will explore various topics, including probability, statistics, machine learning, and data visualization, with a strong emphasis on practical examples and visual aids. In this first installment, we will delve into probability fundamentals, laying the groundwork for the descriptive and inferential statistics that build on them. Stay tuned for more engaging and informative content in the upcoming parts of this series!</p>



<h2 class="wp-block-heading" id="a72b">Introduction</h2>



<p class="wp-block-paragraph" id="ecb6">Probability and statistics are essential disciplines for data scientists, analysts, and researchers. They provide a solid foundation for understanding, interpreting, and drawing meaningful conclusions from data. As the demand for data-driven insights and decision-making grows across various industries, a strong grasp of these concepts is crucial for anyone seeking a career in data science or looking to enhance their analytical skills.</p>



<p class="wp-block-paragraph" id="21de">In this blog post, we will introduce probability and statistics by exploring their basic concepts and terminology. We will explain the core principles and ideas that underpin these disciplines, using real-world examples and R code to help you visualize and understand these concepts in action. By incorporating visuals, we aim to make the material more engaging and easier to comprehend, allowing you to build a strong foundation for future learning.</p>



<p class="wp-block-paragraph" id="654c">Our journey begins with probability fundamentals, including definitions, types of probability, and essential rules. Descriptive statistics, including measures of central tendency, dispersion, and shape, follow in the next part of this series. Throughout the post, we will use the tidyverse collection of R packages, focusing on ggplot2 for data visualization. ggplot2 offers a powerful and flexible way to create high-quality graphics, aiding in data exploration and communication.</p>



<p class="wp-block-paragraph" id="926c">By the end of this post, you will have a solid understanding of basic probability, supported by clear visualizations. This foundational knowledge will prepare you for descriptive statistics and for more advanced topics and techniques in data analysis and machine learning, setting you on a path to success in the ever-evolving world of data science.</p>



<p class="wp-block-paragraph" id="eb8f">Stay tuned as we dive into the fascinating realm of probability and statistics, providing you with practical examples and insights to enhance your understanding and skills.</p>



<h2 class="wp-block-heading" id="2e47">Probability Fundamentals</h2>



<p class="wp-block-paragraph" id="8ec0">Probability is the study of randomness and uncertainty. It provides a way to quantify the likelihood of specific outcomes or events occurring in various situations. Understanding probability fundamentals is crucial for data science, as it underlies many statistical techniques and machine learning algorithms. In this section, we will elaborate on the basic concepts, principles, and rules of probability, using real-world data when possible.</p>



<p class="wp-block-paragraph" id="0f87"><strong>I. Definitions:</strong></p>



<ul class="wp-block-list">
<li>Experiment: An action or procedure that results in one of several possible outcomes. For example, rolling a die is an experiment with six possible outcomes (1, 2, 3, 4, 5, or 6).</li>



<li>Outcome: The result of an experiment. In the die-rolling example, if the die lands on 3, then the outcome is 3.</li>



<li>Event: A set of one or more outcomes. In the die-rolling example, an event could be the die showing an even number, which includes the outcomes {2, 4, 6}.</li>



<li>Sample Space: The set of all possible outcomes of an experiment. For the die-rolling example, the sample space is {1, 2, 3, 4, 5, 6}.</li>
</ul>
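
<p class="wp-block-paragraph">These four terms map directly onto simple R objects. The following is a minimal sketch for the die-rolling example; the variable names and the seed value are illustrative assumptions.</p>

<pre class="wp-block-code"><code>sample_space <- 1:6                                  # all possible outcomes of one roll
event_even <- sample_space[sample_space %% 2 == 0]   # the event "even number": 2, 4, 6

set.seed(1)                                 # arbitrary seed, for reproducibility only
outcome <- sample(sample_space, size = 1)   # run the experiment once
outcome %in% event_even                     # TRUE if the event occurred on this roll</code></pre>

<p class="wp-block-paragraph">Representing an event as a subset of the sample space makes it easy to check whether any single outcome belongs to it.</p>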



<p class="wp-block-paragraph" id="4431"><strong>II. Types of Probability:</strong></p>



<ul class="wp-block-list">
<li><strong>Classical</strong>: Based on the assumption that all outcomes are equally likely. In the die-rolling example, the classical probability of getting an even number is 1/2 (3 even numbers out of 6 possible outcomes).</li>



<li><strong>Relative Frequency</strong>: Based on the observed frequencies of outcomes in a sample. For instance, suppose we roll a die 100 times and observe 40 even numbers. The relative frequency of getting an even number is 40/100 = 0.4.</li>



<li><strong>Subjective</strong>: Based on an individual’s personal judgment or belief. For example, a person may believe that rain is likely tomorrow based on their interpretation of weather patterns and past experiences, even if objective data suggests otherwise.</li>
</ul>
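
<p class="wp-block-paragraph">The first two definitions are easy to compare numerically. The sketch below reuses the figures from the list above (3 even faces out of 6, and 40 even results in 100 rolls); the variable names are illustrative assumptions.</p>

<pre class="wp-block-code"><code># Classical: favourable outcomes / equally likely outcomes
p_classical <- length(c(2, 4, 6)) / length(1:6)

# Relative frequency: observed count / number of trials
n_rolls_observed <- 100
n_even_observed <- 40
p_relative <- n_even_observed / n_rolls_observed

c(classical = p_classical, relative_frequency = p_relative)</code></pre>

<p class="wp-block-paragraph">The two values differ (0.5 versus 0.4) because a relative frequency depends on the particular sample observed. For a fair die, it tends to move toward the classical value as the number of rolls grows.</p>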



<p class="wp-block-paragraph" id="7da8">Let’s use a real-world dataset to illustrate the relative frequency approach. We will analyze the number of cylinders in vehicles from the <code>mtcars</code> dataset, which is included in R. We will calculate the relative frequency of vehicles with 4, 6, and 8 cylinders.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.7083740234375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="library(knitr)
library(dplyr) # provides %>%, count(), and mutate()
data(mtcars)

cylinder_counts <- mtcars %>% count(cyl)
total_cars <- sum(cylinder_counts$n)
relative_frequencies <- cylinder_counts %>% 
mutate(relative_frequency = n / total_cars)

kable(relative_frequencies, caption = &quot;Relative Frequencies of Cylinder Counts in Vehicles&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(knitr)</span></span>
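<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(dplyr) </span><span style="color: #88846F"># provides %>%, count(), and mutate()</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mtcars)</span></span>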
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">cylinder_counts </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mtcars </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> count(cyl)</span></span>
<span class="line"><span style="color: #F8F8F2">total_cars </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">sum</span><span style="color: #F8F8F2">(cylinder_counts</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">n)</span></span>
<span class="line"><span style="color: #F8F8F2">relative_frequencies </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> cylinder_counts </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> </span></span>
<span class="line"><span style="color: #F8F8F2">mutate(</span><span style="color: #FD971F; font-style: italic">relative_frequency</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> n </span><span style="color: #F92672">/</span><span style="color: #F8F8F2"> total_cars)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">kable(relative_frequencies, </span><span style="color: #FD971F; font-style: italic">caption</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Relative Frequencies of Cylinder Counts in Vehicles&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="1024" height="218" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw-1024x218.webp" alt="" class="wp-image-6127" srcset="https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw-1024x218.webp 1024w, https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw-500x106.webp 500w, https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw-150x32.webp 150w, https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw-768x163.webp 768w, https://analyticadss.com/wp-content/uploads/2023/03/1_440_Wk8vvm9A0kbPPg8qcw.webp 1400w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /><figcaption class="wp-element-caption">Relative Frequencies of Cylinder Counts in Vehicles</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="ee7d">In the table above, the “cyl” column represents the number of cylinders in a vehicle, the “n” column shows the count of vehicles with the corresponding number of cylinders, and the “relative_frequency” column displays the relative frequency of each cylinder count in the dataset.</p>



<p class="wp-block-paragraph" id="88cb">Now, let’s visualize the relative frequencies using ggplot2.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.6944427490234375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="library(tidyverse)

ggplot(relative_frequencies, aes(x = factor(cyl), y = relative_frequency)) +
  geom_col(fill = &quot;steelblue&quot;) +
  labs(title = &quot;Relative Frequency of Cylinder Counts in Vehicles&quot;,
       x = &quot;Number of Cylinders&quot;,
       y = &quot;Relative Frequency&quot;) +
  theme_minimal()" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">ggplot(relative_frequencies, aes(</span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">factor</span><span style="color: #F8F8F2">(cyl), </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> relative_frequency)) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  geom_col(</span><span style="color: #FD971F; font-style: italic">fill</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;steelblue&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  labs(</span><span style="color: #FD971F; font-style: italic">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Relative Frequency of Cylinder Counts in Vehicles&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">       </span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Number of Cylinders&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">       </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Relative Frequency&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme_minimal()</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="496" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2023/03/1_I7NZ2TsdPjhhQIZz1bNO8Q.webp" alt="" class="wp-image-6128" srcset="https://analyticadss.com/wp-content/uploads/2023/03/1_I7NZ2TsdPjhhQIZz1bNO8Q.webp 828w, https://analyticadss.com/wp-content/uploads/2023/03/1_I7NZ2TsdPjhhQIZz1bNO8Q-500x300.webp 500w, https://analyticadss.com/wp-content/uploads/2023/03/1_I7NZ2TsdPjhhQIZz1bNO8Q-150x90.webp 150w, https://analyticadss.com/wp-content/uploads/2023/03/1_I7NZ2TsdPjhhQIZz1bNO8Q-768x460.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="f3cf">Another way to demonstrate the relative frequency approach is to simulate a die experiment. We will roll a fair six-sided die 1000 times and calculate the relative frequency of each outcome (1, 2, 3, 4, 5, and 6). Additionally, we will visualize the results using ggplot2.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.708335876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="set.seed(42) # Set seed for reproducibility
n_rolls <- 1000
die_rolls <- sample(1:6, size = n_rolls, replace = TRUE)

die_rolls_df <- data.frame(outcome = die_rolls) %>%
  count(outcome) %>%
  mutate(relative_frequency = n / n_rolls)

die_rolls_df" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">set.seed</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">42</span><span style="color: #F8F8F2">) </span><span style="color: #88846F"># Set seed for reproducibility</span></span>
<span class="line"><span style="color: #F8F8F2">n_rolls </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1000</span></span>
<span class="line"><span style="color: #F8F8F2">die_rolls </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">sample</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #AE81FF">6</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F; font-style: italic">size</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> n_rolls, </span><span style="color: #FD971F; font-style: italic">replace</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">die_rolls_df </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">data.frame</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F; font-style: italic">outcome</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> die_rolls) </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  count(outcome) </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  mutate(</span><span style="color: #FD971F; font-style: italic">relative_frequency</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> n </span><span style="color: #F92672">/</span><span style="color: #F8F8F2"> n_rolls)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">die_rolls_df</span></span></code></pre></div>



<p class="wp-block-paragraph" id="8671">This will generate a data frame with the outcome, count, and relative frequency of each die roll:</p>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="301" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2023/03/1_6cqlVMLGN0pwlyuyCIZKug.webp" alt="" class="wp-image-6129" srcset="https://analyticadss.com/wp-content/uploads/2023/03/1_6cqlVMLGN0pwlyuyCIZKug.webp 828w, https://analyticadss.com/wp-content/uploads/2023/03/1_6cqlVMLGN0pwlyuyCIZKug-500x182.webp 500w, https://analyticadss.com/wp-content/uploads/2023/03/1_6cqlVMLGN0pwlyuyCIZKug-150x55.webp 150w, https://analyticadss.com/wp-content/uploads/2023/03/1_6cqlVMLGN0pwlyuyCIZKug-768x279.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="d448">Now, let’s create a bar chart to visualize the relative frequencies:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.6944427490234375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="ggplot(die_rolls_df, aes(x = factor(outcome), y = relative_frequency)) +
  geom_col(fill = &quot;steelblue&quot;) +
  labs(title = &quot;Die Roll Simulation&quot;,
       x = &quot;Outcome&quot;,
       y = &quot;Relative Frequency&quot;) +
  theme_minimal()" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki monokai" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">ggplot(die_rolls_df, aes(</span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">factor</span><span style="color: #F8F8F2">(outcome), </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> relative_frequency)) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  geom_col(</span><span style="color: #FD971F; font-style: italic">fill</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;steelblue&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  labs(</span><span style="color: #FD971F; font-style: italic">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Die Roll Simulation&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">       </span><span style="color: #FD971F; font-style: italic">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Outcome&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">       </span><span style="color: #FD971F; font-style: italic">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Relative Frequency&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme_minimal()</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="488" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2023/03/1_qLn-3a0Vi2UpTqWf5CQOeg.webp" alt="" class="wp-image-6130" srcset="https://analyticadss.com/wp-content/uploads/2023/03/1_qLn-3a0Vi2UpTqWf5CQOeg.webp 828w, https://analyticadss.com/wp-content/uploads/2023/03/1_qLn-3a0Vi2UpTqWf5CQOeg-500x295.webp 500w, https://analyticadss.com/wp-content/uploads/2023/03/1_qLn-3a0Vi2UpTqWf5CQOeg-150x88.webp 150w, https://analyticadss.com/wp-content/uploads/2023/03/1_qLn-3a0Vi2UpTqWf5CQOeg-768x453.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


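<p class="wp-block-paragraph" id="f84b">The resulting bar chart displays the relative frequencies of each outcome from the die roll simulation. The chart illustrates the concept of relative frequency by showing the proportion of each outcome observed in the experiment. For a fair die, each relative frequency should fall close to the classical probability of 1/6 ≈ 0.167, with small deviations caused by random variation.</p>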



<p class="wp-block-paragraph" id="5529"><strong>III. Probability Rules:</strong></p>



<p class="wp-block-paragraph" id="e648"><strong>Addition Rule</strong>: The addition rule helps us calculate the probability of either event A or event B (or both) occurring. The rule is defined as:</p>



<p class="wp-block-paragraph" id="4952">P(A ∪ B) = P(A) + P(B) − P(A ∩ B)</p>



<p class="wp-block-paragraph" id="9525">Here, P(A ∪ B) represents the probability of event A or event B occurring, while P(A ∩ B) denotes the probability of both events A and B happening together.</p>



<p class="wp-block-paragraph" id="c9a1">Example: Suppose we have a deck of 52 playing cards. What is the probability of drawing either a red card (hearts or diamonds) or a queen?</p>



<p class="wp-block-paragraph" id="85c6">There are 26 red cards and 4 queens in the deck, but 2 of the queens are also red cards (queen of hearts and queen of diamonds). So, applying the addition rule:</p>



<p class="wp-block-paragraph" id="6ec3">P(Red ∪ Queen) = P(Red) + P(Queen) − P(Red ∩ Queen)<br>P(Red ∪ Queen) = (26/52) + (4/52) − (2/52) = 28/52 ≈ 0.5385</p>
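
<p class="wp-block-paragraph">One way to check this result is to build the deck in R and count the cards directly. The sketch below is only an illustration; the data frame <code>deck</code> and the helper vectors <code>is_red</code> and <code>is_queen</code> are assumed names, not part of any package.</p>

<pre class="wp-block-code"><code>suits <- c("hearts", "diamonds", "clubs", "spades")
ranks <- c("A", 2:10, "J", "Q", "K")
deck <- expand.grid(rank = ranks, suit = suits, stringsAsFactors = FALSE)  # 52 rows

is_red <- deck$suit %in% c("hearts", "diamonds")
is_queen <- deck$rank == "Q"

p_red <- mean(is_red)                # 26/52
p_queen <- mean(is_queen)            # 4/52
p_both <- mean(is_red & is_queen)    # 2/52

p_union_rule <- p_red + p_queen - p_both     # addition rule
p_union_direct <- mean(is_red | is_queen)    # direct count of red-or-queen cards
all.equal(p_union_rule, p_union_direct)</code></pre>

<p class="wp-block-paragraph">Subtracting <code>p_both</code> prevents the two red queens from being counted twice, which is why the rule and the direct count agree at 28/52.</p>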



<p class="wp-block-paragraph" id="257a"><strong>Multiplication Rule</strong>: The multiplication rule helps us determine the probability that events A and B both occur. The rule is defined as:</p>



<p class="wp-block-paragraph" id="8dea">P(A ∩ B) = P(A|B) * P(B)</p>



<p class="wp-block-paragraph" id="dd18">Here, P(A|B) represents the probability of event A occurring given that event B has occurred.</p>



<p class="wp-block-paragraph" id="f14b">Example: Consider a bag containing 5 blue and 3 red balls. We draw two balls from the bag without replacement. What is the probability of drawing a blue ball first, followed by a red ball?</p>



<p class="wp-block-paragraph" id="d5d1">To apply the multiplication rule, we first calculate the probability of each event:</p>



<p class="wp-block-paragraph" id="80b0">P(Blue1) = 5/8<br>P(Red2|Blue1) = 3/7</p>



<p class="wp-block-paragraph" id="3d0a">Now, we can compute the probability of both events happening together:</p>



<p class="wp-block-paragraph" id="21c4">P(Blue1 ∩ Red2) = P(Blue1) * P(Red2|Blue1) = (5/8) * (3/7) ≈ 0.2679</p>
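
<p class="wp-block-paragraph">The result can also be checked by simulation. The sketch below assumes nothing beyond base R; the seed, the number of repetitions, and the variable names are arbitrary choices made for illustration.</p>

<pre class="wp-block-code"><code># Exact value from the multiplication rule
p_exact &lt;- (5 / 8) * (3 / 7)

# Monte Carlo check: draw two balls without replacement many times
set.seed(42)  # arbitrary seed, for reproducibility only
bag &lt;- c(rep(&quot;blue&quot;, 5), rep(&quot;red&quot;, 3))
draws &lt;- replicate(100000, sample(bag, 2))  # 2 x 100000 character matrix

# Proportion of runs with blue first and red second
p_sim &lt;- mean(draws[1, ] == &quot;blue&quot; &amp; draws[2, ] == &quot;red&quot;)

c(exact = p_exact, simulated = p_sim)</code></pre>

<p class="wp-block-paragraph">Because <code>sample()</code> draws without replacement by default, the second draw automatically sees a bag of seven balls, which is exactly the conditioning expressed by P(Red2|Blue1).</p>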



<p class="wp-block-paragraph" id="8764">In the next part of this series, we will cover topics related to Descriptive Statistics.</p>



<p class="wp-block-paragraph" id="e9c0">For more practical tips and insights on AI, data science, and statistics, explore our blog at <a href="https://analyticadss.com/blog/" rel="noreferrer noopener" target="_blank">Analytica Data Science Solutions</a>. Discover engaging content to expand your knowledge and stay up-to-date with the latest developments.</p>



<p class="wp-block-paragraph" id="a9b4">In this blog post, we have covered the basics of probability and statistics. If you wish to further expand your knowledge and understanding, here are some references to help you dive deeper into these topics:</p>



<p class="wp-block-paragraph" id="f5eb">Books:</p>



<ol class="wp-block-list">
<li>DeGroot, M. H., & Schervish, M. J. (2012). Probability and Statistics (4th ed.). Pearson.</li>



<li>Wackerly, D., Mendenhall, W., & Scheaffer, R. L. (2007). Mathematical Statistics with Applications (7th ed.). Cengage Learning.</li>



<li>Casella, G., & Berger, R. L. (2001). Statistical Inference (2nd ed.). Duxbury Press.</li>



<li>Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction (2nd ed.). Springer.</li>



<li>Wickham, H., & Grolemund, G. (2016). R for Data Science: Import, Tidy, Transform, Visualize, and Model Data. O’Reilly Media.</li>
</ol>



<p class="wp-block-paragraph" id="ca60">Websites:</p>



<ol class="wp-block-list">
<li>Khan Academy — Probability and Statistics: <a href="https://www.khanacademy.org/math/statistics-probability" rel="noreferrer noopener" target="_blank">https://www.khanacademy.org/math/statistics-probability</a></li>



<li>Stat Trek — Teach yourself statistics: <a href="https://stattrek.com/" rel="noreferrer noopener" target="_blank">https://stattrek.com/</a></li>



<li>Carnegie Mellon University Probability & Statistics: <a href="https://oli.cmu.edu/courses/probability-statistics-open-free/" rel="noreferrer noopener" target="_blank">https://oli.cmu.edu/courses/probability-statistics-open-free/</a></li>
</ol>



<p class="wp-block-paragraph" id="48bc">Online Courses:</p>



<ol class="wp-block-list">
<li>Coursera — Statistics with R Specialization by Duke University: <a href="https://www.coursera.org/specializations/statistics" rel="noreferrer noopener" target="_blank">https://www.coursera.org/specializations/statistics</a></li>



<li>Coursera — Introduction to Probability and Data with R by Duke University: <a href="https://www.coursera.org/learn/probability-intro" rel="noreferrer noopener" target="_blank">https://www.coursera.org/learn/probability-intro</a></li>



<li>edX — Probability and Statistics in Data Science using Python by the University of California, San Diego: <a href="https://www.edx.org/course/probability-and-statistics-in-data-science-using-python" rel="noreferrer noopener" target="_blank">https://www.edx.org/course/probability-and-statistics-in-data-science-using-python</a></li>



<li>DataCamp — Introduction to Probability in R: <a href="https://www.datacamp.com/courses/introduction-to-probability-in-r" rel="noreferrer noopener" target="_blank">https://www.datacamp.com/courses/introduction-to-probability-in-r</a></li>



<li>DataCamp — Foundations of Probability in R: <a href="https://www.datacamp.com/courses/foundations-of-probability-in-r" rel="noreferrer noopener" target="_blank">https://www.datacamp.com/courses/foundations-of-probability-in-r</a></li>
</ol>



<p class="wp-block-paragraph">Read More blogs in AnalyticaDSS Blogs here : <a href="https://analyticadss.com/blog">BLOGS</a></p>



<p class="wp-block-paragraph">Read More blogs in Medium : <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>
<p>The post <a href="https://analyticadss.com/introduction-to-probability-and-statistics-basic-concepts-and-terminology-with-visuals-part-i/">Introduction to Probability and Statistics: Basic Concepts and Terminology with Visuals — Part I</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>R on Steroids with Rcpp Library!</title>
		<link>https://analyticadss.com/r-on-steroids-with-rcpp-library/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Tue, 03 Jan 2023 20:00:59 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Future]]></category>
		<category><![CDATA[Future Technology]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=5836</guid>

					<description><![CDATA[<p>Introduction R is a popular programming language for data analysis and statistical computing. It is widely used in a variety of fields, including finance, healthcare, and research, and is known for its powerful tools for data manipulation, visualization, and statistical analysis. Despite its many strengths, R can sometimes be slow, especially when performing computationally intensive [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/r-on-steroids-with-rcpp-library/">R on Steroids with Rcpp Library!</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading" id="e742">Introduction</h2>



<p class="wp-block-paragraph" id="c1e4"><strong>R </strong>is a popular programming language for data analysis and statistical computing. It is widely used in a variety of fields, including finance, healthcare, and research, and is known for its powerful tools for data manipulation, visualization, and statistical analysis.</p>



<p class="wp-block-paragraph" id="3953">Despite its many strengths, R can sometimes be slow, especially when performing computationally intensive tasks or working with large datasets. This can be a problem for users who need to analyze data quickly or who are working on time-sensitive projects. To address this issue, there is a need for tools and techniques that can help to speed up R code.</p>



<p class="wp-block-paragraph" id="fc46">One way to speed up R code is to use Rcpp, a package for R that allows you to easily integrate C++ code into R. By using Rcpp, you can take advantage of the speed and efficiency of C++ to make your R code run faster. In this article, we will explore the benefits of using Rcpp and how it can help you to speed up your R code.</p>



<h2 class="wp-block-heading" id="c1d5">Why Rcpp is faster than R</h2>



<p class="wp-block-paragraph" id="10a8">One of the main reasons why <strong>Rcpp </strong>can be faster than R is that it allows you to write code in C++, which is a compiled language. This means that the code is transformed into machine code before it is executed, which can be much faster than interpreted languages like R.</p>



<p class="wp-block-paragraph" id="db71">In contrast, interpreted languages like R are executed directly by the interpreter, without the need for pre-compilation. While this can make them easier to use and more flexible, it can also make them slower, as the interpreter has to parse and execute the code on the fly.</p>



<p class="wp-block-paragraph" id="1199">Rcpp makes it easy to write C++ code that can be called from R. When you use Rcpp, your C++ code is compiled into a shared library, which can then be loaded and called from R. This allows you to take advantage of the speed and efficiency of C++ while still using R for your overall workflow.</p>



<p class="wp-block-paragraph" id="47dd">There are many types of tasks that can benefit from using Rcpp. In general, tasks that involve heavy computation or looping can be particularly well-suited for Rcpp, as these types of tasks can be very slow in R. Examples of tasks that might benefit from using Rcpp include machine learning algorithms, simulations, and data manipulation.</p>



<h2 class="wp-block-heading" id="5c5d">Getting started with Rcpp</h2>



<p class="wp-block-paragraph" id="bf5e">To use Rcpp, you will first need to install it. You can do this by running the following command in your R session:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="install.packages(&quot;Rcpp&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">install.packages</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;Rcpp&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="3875">Once Rcpp is installed, you can load it into your R session using the <code>library</code> function:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="library(Rcpp)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(Rcpp)</span></span></code></pre></div>



<p class="wp-block-paragraph">An Rcpp function is a C++ function that can be called from R. It has a specific structure that includes a list of input arguments and a return value. Here is an example of a simple Rcpp function that takes two integers as input and returns their sum:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
int sum(int x, int y) {
  return x + y;
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F">#include <Rcpp.h></span></span>
<span class="line"><span style="color: #F8F8F2">using namespace Rcpp;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F92672">//</span><span style="color: #F8F8F2"> [[Rcpp</span><span style="color: #F92672">::</span><span style="color: #F8F8F2">export]]</span></span>
<span class="line"><span style="color: #F8F8F2">int </span><span style="color: #66D9EF">sum</span><span style="color: #F8F8F2">(int x, int y) {</span></span>
<span class="line"><span style="color: #F8F8F2">  return x </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> y;</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph" id="d6a4">The <code>#include &lt;Rcpp.h&gt;</code> line includes the Rcpp header file, which provides access to Rcpp's functions and types. The <code>using namespace Rcpp;</code> line lets you use those functions and types without prefixing them with <code>Rcpp::</code>. The <code>// [[Rcpp::export]]</code> attribute tells Rcpp to generate the wrapper that makes the function callable from R. Note that an exported function named <code>sum</code> masks base R's <code>sum()</code> in the environment it is loaded into, so a more specific name is safer in real code.</p>
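
<p class="wp-block-paragraph">To actually call such a function, it has to be compiled from within R. One way, sketched below, is <code>cppFunction()</code>, which compiles C++ source given as a string and adds the include, namespace, and export lines for you; code kept in a file can be compiled with <code>sourceCpp()</code> instead. The name <code>add_ints</code> is a hypothetical one chosen for this illustration, and a working C++ compiler is assumed to be installed.</p>

<pre class="wp-block-code"><code>library(Rcpp)

# Compile a small C++ function from a string and expose it to R.
# add_ints is a hypothetical name, chosen so that base::sum() is not masked.
cppFunction(&quot;
int add_ints(int x, int y) {
  return x + y;
}
&quot;)

add_ints(2L, 3L)  # 5</code></pre>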



<p class="wp-block-paragraph" id="9f2e">Here is an example of a more complete Rcpp function that demonstrates how to pass variables between R and C++ and return a result to R:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.39583444595337px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector matrixMultiply(NumericMatrix A, NumericVector x) {
  int nrow = A.nrow(), ncol = A.ncol();

  // Check that the dimensions of A and x are compatible
  if (ncol != x.size()) {
    stop(&quot;Incompatible dimensions: cannot multiply matrix and vector.&quot;);
  }

  // Create the result vector
  NumericVector y(nrow);

  // Perform the matrix-vector multiplication
  for (int i = 0; i < nrow; i++) {
    double sum = 0;
    for (int j = 0; j < ncol; j++) {
      sum += A(i, j) * x[j];
    }
    y[i] = sum;
  }

  return y;
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F">#include <Rcpp.h></span></span>
<span class="line"><span style="color: #F8F8F2">using namespace Rcpp;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F92672">//</span><span style="color: #F8F8F2"> [[Rcpp</span><span style="color: #F92672">::</span><span style="color: #F8F8F2">export]]</span></span>
<span class="line"><span style="color: #F8F8F2">NumericVector matrixMultiply(NumericMatrix A, NumericVector x) {</span></span>
<span class="line"><span style="color: #F8F8F2">  int </span><span style="color: #FD971F">nrow</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> A.nrow(), </span><span style="color: #FD971F">ncol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> A.ncol();</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">//</span><span style="color: #F8F8F2"> Check that the dimensions of A and x are compatible</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> (ncol </span><span style="color: #F92672">!=</span><span style="color: #F8F8F2"> x.size()) {</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #66D9EF">stop</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;Incompatible dimensions: cannot multiply matrix and vector.&quot;</span><span style="color: #F8F8F2">);</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">//</span><span style="color: #F8F8F2"> Create the result vector</span></span>
<span class="line"><span style="color: #F8F8F2">  NumericVector y(nrow);</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">//</span><span style="color: #F8F8F2"> Perform the matrix</span><span style="color: #F92672">-</span><span style="color: #F8F8F2">vector multiplication</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (int </span><span style="color: #FD971F">i</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">; i </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> nrow; i</span><span style="color: #F92672">++</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">    double </span><span style="color: #FD971F">sum</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">;</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (int </span><span style="color: #FD971F">j</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">; j </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> ncol; j</span><span style="color: #F92672">++</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">      sum </span><span style="color: #F92672">+</span><span style="color: #F8F8F2">= A(i, j) </span><span style="color: #F92672">*</span><span style="color: #F8F8F2"> x[j];</span></span>
<span class="line"><span style="color: #F8F8F2">    }</span></span>
<span class="line"><span style="color: #F8F8F2">    y[i] = sum;</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">  return y;</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph">This function takes in a matrix <code>A</code> and a vector <code>x</code>, and returns their matrix-vector product as a new vector <code>y</code>. It first checks that the dimensions of <code>A</code> and <code>x</code> are compatible, and then performs the matrix-vector multiplication by looping over the rows of <code>A</code> and summing the products of the corresponding entries. Finally, it returns the result vector <code>y</code> to R.</p>
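
<p class="wp-block-paragraph">A short usage sketch may help here. It passes the same C++ source to <code>sourceCpp()</code> through its <code>code</code> argument and compares the result with R's built-in <code>%*%</code> operator. The test matrix and vector are arbitrary values picked for illustration.</p>

<pre class="wp-block-code"><code>library(Rcpp)

# Compile the function shown above from an inline string
sourceCpp(code = '
#include &lt;Rcpp.h&gt;
using namespace Rcpp;

// [[Rcpp::export]]
NumericVector matrixMultiply(NumericMatrix A, NumericVector x) {
  int nrow = A.nrow(), ncol = A.ncol();
  if (ncol != x.size()) {
    stop(&quot;Incompatible dimensions: cannot multiply matrix and vector.&quot;);
  }
  NumericVector y(nrow);
  for (int i = 0; i &lt; nrow; i++) {
    double sum = 0;
    for (int j = 0; j &lt; ncol; j++) {
      sum += A(i, j) * x[j];
    }
    y[i] = sum;
  }
  return y;
}
')

A &lt;- matrix(1:6, nrow = 2)  # 2 x 3 matrix, filled column by column
x &lt;- c(1, 0, -1)

y &lt;- matrixMultiply(A, x)
y                                 # -4 -4

# The result should agree with R's built-in matrix product
all.equal(y, as.vector(A %*% x))  # TRUE</code></pre>

<p class="wp-block-paragraph">Calling <code>matrixMultiply(A, c(1, 2))</code> would instead raise the "Incompatible dimensions" error as an ordinary R error, because Rcpp's <code>stop()</code> is translated into an R condition.</p>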



<h2 class="wp-block-heading" id="1ba7">Tips for optimizing Rcpp code</h2>



<p class="wp-block-paragraph" id="bb75">One of the first steps in optimizing Rcpp code is to identify the bottlenecks in your code, i.e., the parts of the code that are taking the most time to execute. There are a number of tools available for profiling R code, such as the <code>profvis</code> package and the <code>Rprof</code> function. By using these tools, you can get a sense of which parts of your code are taking the most time, and focus your optimization efforts on those areas.</p>
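
<p class="wp-block-paragraph">As a rough illustration of the profiling step, the sketch below uses base R's <code>Rprof()</code> and <code>summaryRprof()</code> on a deliberately slow function. The function <code>slow_cumsum</code> and the input size are hypothetical, invented only to give the profiler something to measure.</p>

<pre class="wp-block-code"><code># A deliberately slow function: grows a vector one element at a time
slow_cumsum &lt;- function(x) {
  out &lt;- numeric(0)
  total &lt;- 0
  for (value in x) {
    total &lt;- total + value
    out &lt;- c(out, total)  # reallocates the vector on every iteration
  }
  out
}

prof_file &lt;- tempfile(fileext = &quot;.out&quot;)
Rprof(prof_file)                       # start the sampling profiler
result &lt;- slow_cumsum(runif(50000))
Rprof(NULL)                            # stop profiling

# Functions ranked by the time spent inside them
head(summaryRprof(prof_file)$by.self)</code></pre>

<p class="wp-block-paragraph">If a loop like this one dominates the profile, it is a natural candidate for rewriting in C++ with Rcpp (or, in this particular case, for replacing with the built-in <code>cumsum()</code>).</p>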



<p class="wp-block-paragraph" id="530d">There are a number of ways to optimize Rcpp code, depending on the specific needs of your project. Here are a few tips:</p>



<ul class="wp-block-list">
<li>Avoid unnecessary copies: Rcpp types such as <code>NumericVector</code> and <code>NumericMatrix</code> are thin wrappers around the underlying R object, so passing a matching R vector or matrix to C++ does not copy its data. Copies happen when a conversion is needed, for example when an integer vector is passed as a <code>NumericVector</code>, or when <code>as</code> and <code>wrap</code> convert to and from standard C++ containers such as <code>std::vector</code>.</li>



<li>Use Rcpp’s special types and functions: Rcpp provides a number of special types and functions that can make it easier to work with R data from C++. For example, the <code>NumericMatrix</code> and <code>NumericVector</code> types provide convenient ways to access and manipulate matrix and vector data, while the <code>Rcout</code> and <code>Rcerr</code> streams allow you to print to the R console from C++.</li>



<li>Consider using parallelization: If your code can be parallelized, RcppParallel can be a powerful way to speed it up. It provides a number of tools for writing concurrent C++ code that can be called from R.</li>
</ul>
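
<p class="wp-block-paragraph">The point about copies can be seen in a small experiment. In the sketch below, <code>double_in_place</code> is a hypothetical helper written only for this illustration. It modifies its argument inside C++, which shows when R and C++ share the same memory and when a conversion has made a copy.</p>

<pre class="wp-block-code"><code>library(Rcpp)

# double_in_place is a hypothetical helper used only for this illustration
cppFunction(&quot;
void double_in_place(NumericVector v) {
  for (int i = 0; i &lt; v.size(); i++) {
    v[i] *= 2;
  }
}
&quot;)

x &lt;- c(1, 2, 3)      # double vector: shared with C++, no copy is made
double_in_place(x)
x                    # 2 4 6

n &lt;- 1:3             # integer vector: converted (and so copied) to double
double_in_place(n)
n                    # still 1 2 3</code></pre>

<p class="wp-block-paragraph">Sharing memory is what makes Rcpp fast, but it also means a C++ function can change an R object behind R's back. Working on <code>clone(v)</code> inside the function is the usual way to avoid that when in-place modification is not intended.</p>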



<h2 class="wp-block-heading" id="3987">Real-world examples of using Rcpp</h2>



<p class="wp-block-paragraph" id="63cd">To give you a sense of the types of tasks that can benefit from using Rcpp, here are a few areas where R packages rely on compiled code for speed:</p>



<ul class="wp-block-list">
<li>Machine learning: Moving the core of an algorithm into C++ can speed up methods such as gradient boosting and k-means clustering. For example, the <code>xgboost</code> package gets its speed from a C++ implementation of the XGBoost algorithm that is called from R, the same compiled-code approach that Rcpp makes convenient.</li>



<li>Simulations: Rcpp can be very useful for performing complex simulations, as it allows you to take advantage of C++’s speed and efficiency to run many simulations in a short amount of time. For example, the <code>simstudy</code> package uses Rcpp to perform simulations for statistical power calculations.</li>



<li>Data manipulation: Rcpp can be used to perform complex data manipulation tasks, such as reshaping or aggregating data. The <code>data.table</code> package shows how much compiled code helps here, although its fast routines are written in C against R's own API rather than with Rcpp.</li>
</ul>



<h2 class="wp-block-heading" id="f81c">Conclusion</h2>



<p class="wp-block-paragraph" id="d8f4">In this article, we have explored the benefits of using Rcpp to speed up R code. We have seen that Rcpp allows you to write C++ code that is compiled and called from R, taking advantage of the speed and efficiency of C++ to make your R code run faster. We have also looked at some tips for optimizing Rcpp code and a few examples of real-world projects that have used Rcpp to speed up their code.</p>



<p class="wp-block-paragraph" id="ed25">If you are working on a project that involves heavy computation or looping, or if you simply need to speed up your R code, you may want to consider using Rcpp. While Rcpp can be somewhat more complex to use than pure R code, it can be a powerful tool for improving the performance of your R code.</p>



<p class="wp-block-paragraph" id="7798">To learn more about Rcpp, you may want to check out the following resources:</p>



<ul class="wp-block-list">
<li>The Rcpp documentation: <a href="https://cran.r-project.org/web/packages/Rcpp/index.html" rel="noreferrer noopener" target="_blank">https://cran.r-project.org/web/packages/Rcpp/index.html</a></li>



<li>The Rcpp gallery: <a href="https://rcpp.org/gallery/" rel="noreferrer noopener" target="_blank">https://rcpp.org/gallery/</a></li>



<li>Hadley Wickham’s “Advanced R” book: <a href="https://adv-r.hadley.nz/" rel="noreferrer noopener" target="_blank">https://adv-r.hadley.nz/</a></li>
</ul>



<p class="wp-block-paragraph">Read More blogs in AnalyticaDSS Blogs here : <a href="https://analyticadss.com/blog">BLOGS</a></p>



<p class="wp-block-paragraph">Read More blogs in Medium : <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>
<p>The post <a href="https://analyticadss.com/r-on-steroids-with-rcpp-library/">R on Steroids with Rcpp Library!</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Unsolved Math Problems: The Goldbach Conjecture!</title>
		<link>https://analyticadss.com/unsolved-math-problems-the-goldbach-conjecture/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Fri, 06 Aug 2021 15:11:58 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Mathematics]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Mathematics Education]]></category>
		<category><![CDATA[Numerical Methods]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4818</guid>

					<description><![CDATA[<p>“Unsolved Math Problems: The Goldbach Conjecture!” There are still many unsolved problems in Mathematics, despite countless research trying to solve these problems. Our Math problem for today is about the Goldbach Conjecture. In a previous post, I talked about the Collatz Conjecture, which is one of my favorite unsolved problems in Mathematics. The Goldbach conjecture is [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/unsolved-math-problems-the-goldbach-conjecture/">Unsolved Math Problems: The Goldbach Conjecture!</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[



<p class="wp-block-paragraph" id="c90c">There are still many unsolved problems in mathematics, despite extensive research aimed at solving them. Today's problem is the Goldbach Conjecture. In a previous <a href="https://medium.com/@aousabdo/the-collatz-conjecture-611b65486f90">post</a>, I talked about the Collatz Conjecture, which is one of my favorite unsolved problems in mathematics.</p>



<p class="wp-block-paragraph" id="8ae4">The Goldbach conjecture is a famous open problem in mathematics that states that every even integer greater than 2 can be expressed as the sum of two prime numbers. This conjecture has been verified for very large numbers, but a complete proof or disproof has eluded mathematicians for nearly three centuries.</p>



<h4 class="wp-block-heading">Who proposed the conjecture?</h4>



<p class="wp-block-paragraph" id="8f41">The conjecture was first proposed by Christian Goldbach, a Prussian mathematician, in a letter to his colleague Leonhard Euler in 1742. Goldbach’s conjecture states that every even integer greater than 2 can be written as the sum of two prime numbers. For example, 4 can be written as the sum of 2 and 2, 6 can be written as the sum of 3 and 3, and 8 can be written as the sum of 5 and 3.</p>
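
<p class="wp-block-paragraph">Decompositions like the ones above are easy to find by brute force. The sketch below uses simple trial division, which is only practical for small numbers; the function names <code>is_prime</code> and <code>goldbach_pair</code> are our own and are not taken from any package.</p>

<pre class="wp-block-code"><code># Trial-division primality test (fine for small n)
is_prime &lt;- function(n) {
  if (n &lt; 2) return(FALSE)
  if (n &lt; 4) return(TRUE)           # 2 and 3 are prime
  if (n %% 2 == 0) return(FALSE)
  d &lt;- 3
  while (d * d &lt;= n) {
    if (n %% d == 0) return(FALSE)
    d &lt;- d + 2
  }
  TRUE
}

# Return the first pair of primes (p, n - p) that sums to the even number n
goldbach_pair &lt;- function(n) {
  stopifnot(n &gt; 2, n %% 2 == 0)
  for (p in 2:(n %/% 2)) {
    if (is_prime(p) &amp;&amp; is_prime(n - p)) return(c(p, n - p))
  }
  NULL  # reaching this line would mean a counterexample was found
}

goldbach_pair(8)    # 3 5
goldbach_pair(100)  # 3 97</code></pre>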



<p class="wp-block-paragraph" id="a6dd">Despite its simplicity, the Goldbach conjecture has proven to be a difficult problem to solve. Over the years, many mathematicians have attempted to prove or disprove the conjecture, but to date, no one has been able to come up with a complete proof.</p>



<p class="wp-block-paragraph" id="fdb1"><strong>One of the reasons</strong> the Goldbach conjecture is so difficult to prove is that it involves prime numbers, which are integers greater than 1 that are divisible only by 1 and themselves. Primes are defined through multiplication, whereas the conjecture is a statement about addition, and questions that mix the two are notoriously difficult.</p>



<h4 class="wp-block-heading">Another reason:</h4>



<p class="wp-block-paragraph" id="871c">Another reason the Goldbach conjecture is difficult to prove is that it involves the concept of infinity. The conjecture states that every even integer greater than 2 can be written as the sum of two prime numbers, which means that it applies to infinitely many even integers. No finite amount of case-by-case checking can settle such a claim; a proof has to be a general argument that covers every even integer at once.</p>



<p class="wp-block-paragraph" id="f327">Despite the difficulty of the problem, many mathematicians have attempted to prove the Goldbach conjecture over the years. Much of the progress has come from analytic number theory, which grew out of Bernhard Riemann's 19th-century study of the zeta function. This function, first considered by Euler and extended by Riemann to complex numbers, allows mathematicians to study the distribution of prime numbers and has been used to make significant progress on many other open problems in mathematics.</p>



<h4 class="wp-block-heading">Over the years</h4>



<p class="wp-block-paragraph" id="0180">Over the years, mathematicians have used computers to verify the Goldbach conjecture for every even integer up to about 4 × 10¹⁸, or 4 quintillion. This verification was carried out in a distributed computation led by Tomás Oliveira e Silva, with the results published in 2014.</p>



<p class="wp-block-paragraph" id="7d5f">However, verifying the conjecture for every even integer up to a very large bound does not constitute a proof. A proof would require a general argument that applies to all even integers greater than 2, not just to those below some finite bound.</p>
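<p class="wp-block-paragraph">To make the difference between checking and proving concrete, here is a small base-R sketch of a bounded check. It is an illustration under stated assumptions, not the method used in any published verification; the function names <code>prime_sieve</code> and <code>goldbach_failures</code> are hypothetical. It sieves the primes up to a limit and reports any even number in that range with no two-prime decomposition.</p>

```r
# Sieve of Eratosthenes: logical vector where is_p[k] is TRUE when k is prime.
prime_sieve <- function(limit) {
  is_p <- rep(TRUE, limit)
  is_p[1] <- FALSE
  if (limit >= 4) {
    for (k in 2:floor(sqrt(limit))) {
      if (is_p[k]) is_p[seq(k * k, limit, by = k)] <- FALSE
    }
  }
  is_p
}

# Return the even numbers in 4..limit that have NO two-prime decomposition.
# (Hypothetical helper for illustration only.)
goldbach_failures <- function(limit) {
  is_p <- prime_sieve(limit)
  primes <- which(is_p)
  evens <- seq(4, limit, by = 2)
  ok <- vapply(evens, function(n) {
    p <- primes[primes <= n / 2]   # candidate smaller primes
    any(is_p[n - p])               # is some complement also prime?
  }, logical(1))
  evens[!ok]
}

goldbach_failures(10000)  # an empty vector: no counterexample up to 10,000
```

<p class="wp-block-paragraph">Raising the limit only ever extends the checked range. However large it gets, infinitely many even integers remain unchecked, which is why a computation like this can find a counterexample but can never prove the conjecture.</p>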



<p class="wp-block-paragraph" id="b3b4">Despite the challenges, many mathematicians continue to work on the Goldbach conjecture, as it remains one of the most famous open problems in mathematics. The conjecture has inspired much research and has led to the development of new mathematical techniques and tools, which have in turn been used to make progress on other open problems in mathematics.</p>



<h3 class="wp-block-heading">Conclusion</h3>



<p class="wp-block-paragraph" id="cfb9">The Goldbach conjecture is a famous open problem in mathematics that has resisted proof or disproof for nearly three centuries. Despite the difficulty of the problem, many mathematicians continue to work on it, as it remains an important and fascinating area of study.</p>



<p>The post <a href="https://analyticadss.com/unsolved-math-problems-the-goldbach-conjecture/">Unsolved Math Problems: The Goldbach Conjecture!</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The Tidyverse and data.table R Packages</title>
		<link>https://analyticadss.com/the-tidyverse-and-data-table-r-packages/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Sun, 14 Feb 2021 15:21:31 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Tidyverse]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4821</guid>

					<description><![CDATA[<p>“The Tidyverse and data.table R Packages” The power of R comes from the vast collection of software libraries, i.e. packages, that can be easily installed and loaded in R. Today we will cover two of the most powerful packages in R, the tidyverse and data.table packages. The tidyverse and data.table are two popular packages in R that provide functions for working with data. [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/the-tidyverse-and-data-table-r-packages/">The Tidyverse and data.table R Packages</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">“The Tidyverse and data.table R Packages”</p>



<p class="wp-block-paragraph" id="73a3">The power of R comes from the vast collection of software libraries, i.e. packages, that can be easily installed and loaded in R. Today we will cover two of the most powerful packages in R, the <strong><code>tidyverse</code> </strong>and <code><strong>data.table</strong></code> packages.</p>



<p class="wp-block-paragraph" id="12a6">The <strong><code>tidyverse</code> </strong>and <strong><code>data.table</code> </strong>are two popular packages in R that provide functions for working with data. They both have their own strengths and are suitable for different types of tasks.</p>



<p class="wp-block-paragraph" id="df56">The <strong><code>tidyverse</code> </strong>is a collection of packages designed for data manipulation, visualization, and modeling. It is based on the principles of tidy data, which hold that each variable should be a column, each observation a row, and each value a cell, so that the data is easy to work with. The <strong><code>tidyverse</code> </strong>includes packages such as <code><strong>dplyr</strong></code>, <code><strong>tidyr</strong></code>, and <code>ggplot2</code>, which provide functions for data manipulation, cleaning, and visualization.</p>
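<p class="wp-block-paragraph">As a rough illustration of what tidy structure means in practice, the sketch below reshapes a small, made-up wide table (the countries and values are invented for this example) into tidy form using <code>pivot_longer()</code> from <code>tidyr</code>.</p>

```r
library(tidyr)

# Hypothetical wide table: one column per year, so the "year" variable
# is hidden in the column names rather than stored in a column.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(10, 20),
  `2020` = c(11, 22),
  check.names = FALSE
)

# Tidy form: every variable (country, year, value) gets its own column,
# and every row is a single observation.
tidy <- pivot_longer(wide, cols = c(`2019`, `2020`),
                     names_to = "year", values_to = "value")
tidy
```

<p class="wp-block-paragraph">Once the data is in this shape, functions such as <code>group_by()</code> and <code>ggplot()</code> can refer to <code>year</code> like any other column.</p>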



<p class="wp-block-paragraph" id="1c2b">One of the main advantages of the <strong><code>tidyverse</code> </strong>is its simplicity. The functions in the <strong><code>tidyverse</code> </strong>are easy to learn and use, and they often require fewer lines of code compared to other packages. They also have a consistent syntax, which makes it easier to learn and use multiple functions.</p>
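<p class="wp-block-paragraph">The consistent syntax is easiest to see in a pipeline. In this sketch (the particular steps and the name <code>compact_summary</code> are chosen only for illustration), each <code>dplyr</code> verb takes a data frame as its first argument and returns a data frame, so the steps chain together with <code>%&gt;%</code>.</p>

```r
library(dplyr)
library(ggplot2)  # only needed here for the mpg dataset

# Each verb takes a data frame first and returns a data frame,
# so steps chain with %>% without intermediate variables.
compact_summary <- mpg %>%
  filter(class == "compact") %>%          # keep rows
  mutate(avg_mpg = (cty + hwy) / 2) %>%   # add a column
  group_by(manufacturer) %>%              # define groups
  summarize(avg_mpg = mean(avg_mpg), n = n()) %>%
  arrange(desc(avg_mpg))                  # sort rows

compact_summary
```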



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="c2b4">Tidyverse Examples</h2>



<p class="wp-block-paragraph" id="2da7">Here are some examples of how to use the <code><strong>tidyverse</strong></code>:</p>



<p class="wp-block-paragraph" id="2df8">To select specific columns from a dataset:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse package
library(tidyverse)

# Load the mpg dataset from the ggplot2 package
data(mpg)

# Select the &quot;manufacturer&quot; and &quot;model&quot; columns
mpg %>% select(manufacturer, model)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Select the &quot;manufacturer&quot; and &quot;model&quot; columns</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> select(manufacturer, model)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="8924">To group and summarize a dataset:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse package
library(tidyverse)

# Load the mpg dataset from the ggplot2 package
data(mpg)

# Group the dataset by &quot;class&quot; and compute the mean of the &quot;hwy&quot; column
mpg %>% group_by(class) %>% summarize(mean_hwy = mean(hwy))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Group the dataset by &quot;class&quot; and compute the mean of the &quot;hwy&quot; column</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> group_by(class) </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> summarize(</span><span style="color: #FD971F">mean_hwy</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">mean</span><span style="color: #F8F8F2">(hwy))</span></span></code></pre></div>



<p class="wp-block-paragraph" id="d645">To join two datasets:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse package
library(tidyverse)

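# Load the mpg dataset from the ggplot2 package
data(mpg)

# ggplot2 has no &quot;cylinders&quot; dataset, so derive one: mean cylinder count per manufacturer
cylinders <- mpg %>% group_by(manufacturer) %>% summarize(mean_cyl = mean(cyl))

# Join the mpg and cylinders datasets on the &quot;manufacturer&quot; column
mpg %>% left_join(cylinders, by = &quot;manufacturer&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># ggplot2 has no &quot;cylinders&quot; dataset, so derive one: mean cylinder count per manufacturer</span></span>
<span class="line"><span style="color: #F8F8F2">cylinders </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mpg </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> group_by(manufacturer) </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> summarize(</span><span style="color: #FD971F">mean_cyl</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">mean</span><span style="color: #F8F8F2">(cyl))</span></span>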
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Join the mpg and cylinders datasets on the &quot;manufacturer&quot; column</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> left_join(cylinders, </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;manufacturer&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="68fd">To perform a linear regression using the <code>lm</code> function from the <code>stats</code> package:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse and stats packages
library(tidyverse)
library(stats)

# Load the mtcars dataset
data(mtcars)

# Perform a linear regression to predict mpg (miles per gallon) using wt (weight) as the predictor variable
fit <- mtcars %>% 
  lm(mpg ~ wt, data = .)

# Summarize the model results
summary(fit)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse and stats packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(stats)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mtcars dataset</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mtcars)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Perform a linear regression to predict mpg (miles per gallon) using wt (weight) as the predictor variable</span></span>
<span class="line"><span style="color: #F8F8F2">fit </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mtcars </span><span style="color: #F92672">%>%</span><span style="color: #F8F8F2"> </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">lm</span><span style="color: #F8F8F2">(mpg </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> wt, </span><span style="color: #FD971F">data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> .)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Summarize the model results</span></span>
<span class="line"><span style="color: #66D9EF">summary</span><span style="color: #F8F8F2">(fit)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="9d7e">To create a scatterplot matrix using the <code>scatterplotMatrix</code> function from the <code>car</code> package:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse and car packages
library(tidyverse)
library(car)

# Load the iris dataset
data(iris)

# Create a scatterplot matrix of the iris dataset
scatterplotMatrix(iris, smooth = FALSE)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse and car packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(car)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the iris dataset</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(iris)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a scatterplot matrix of the iris dataset</span></span>
<span class="line"><span style="color: #F8F8F2">scatterplotMatrix(iris, </span><span style="color: #FD971F">smooth</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">FALSE</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="9cfb">To create a faceted histogram using <code><strong>ggplot2</strong></code>:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the tidyverse package
library(tidyverse)

# Load the mpg dataset from the ggplot2 package
data(mpg)

# Create a faceted histogram showing the distribution of hwy (highway miles per gallon) by class and drv (drive type)
ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~ class + drv, nrow = 2)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the tidyverse package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyverse)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a faceted histogram showing the distribution of hwy (highway miles per gallon) by class and drv (drive type)</span></span>
<span class="line"><span style="color: #F8F8F2">ggplot(mpg, aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> hwy)) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  geom_histogram(</span><span style="color: #FD971F">binwidth</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  facet_wrap(</span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> class </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> drv, </span><span style="color: #FD971F">nrow</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="804c">data.table Examples</h2>



<p class="wp-block-paragraph" id="17bf">The <code><strong>data.table</strong></code> package, on the other hand, is a high-performance package for working with large datasets. It provides a compact syntax for manipulating and querying data efficiently. Because <code><strong>data.table</strong></code> works on data held in memory, it does not help with datasets that exceed available RAM, but it is particularly useful when a dataset is large enough that copying it is expensive or when you need to perform complex operations on large datasets.</p>
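<p class="wp-block-paragraph">One source of this efficiency is that <code>data.table</code> can add or update columns in place with the <code>:=</code> operator instead of copying the whole table. The sketch below uses a small made-up table (the column names are invented for this example) to show the idea.</p>

```r
library(data.table)

# Small illustrative table; the column names are made up for this sketch.
dt <- data.table(id = 1:5, price = c(10, 20, 30, 40, 50))

# := adds or updates columns in place, without copying the whole table.
dt[, price_with_tax := price * 1.2]

# Update only the rows matching a condition, again in place.
dt[price > 30, price := price - 5]

dt
```

<p class="wp-block-paragraph">Note that no assignment with <code>&lt;-</code> is needed: <code>dt</code> itself is modified. This is convenient for large tables, but it also means that other variables pointing at the same table see the change unless you take an explicit <code>copy()</code> first.</p>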



<h4 class="wp-block-heading">One of the main advantages of the <code><strong>data.table</strong></code> package</h4>



<p class="wp-block-paragraph" id="9709">One of the main advantages of the <code><strong>data.table</strong></code> package is its speed. The functions in the <code><strong>data.table</strong></code> package are generally faster than their counterparts in the <code><strong>tidyverse</strong></code>, especially when working with large datasets.</p>
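<p class="wp-block-paragraph">Speed claims are best checked on your own data. The following sketch times the same grouped mean in both packages on simulated data; the table size, the column names <code>g</code> and <code>x</code>, and the use of <code>system.time()</code> are choices made for this illustration. Timings depend on the machine, the shape of the data, and the package versions, so no numbers are quoted here.</p>

```r
library(data.table)
library(dplyr)

set.seed(42)
n <- 1e6
df <- data.frame(g = sample(letters, n, replace = TRUE), x = runif(n))
dt <- as.data.table(df)

# Time the same grouped mean in both packages.
t_dplyr <- system.time(
  res_dplyr <- df %>% group_by(g) %>% summarize(mean_x = mean(x))
)
t_dt <- system.time(
  res_dt <- dt[, .(mean_x = mean(x)), by = g]
)

print(t_dplyr)
print(t_dt)

# Sanity check: both approaches compute the same group means.
# data.table returns groups in order of first appearance, so sort by g first.
all.equal(res_dplyr$mean_x, res_dt[order(g)]$mean_x)
```

<p class="wp-block-paragraph">A single <code>system.time()</code> call is a coarse measurement; for a careful comparison, repeat each expression several times and compare the distributions of the timings.</p>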



<p class="wp-block-paragraph" id="d980">Here are some examples of how to use the <strong><code>data.table</code></strong> package:</p>



<p class="wp-block-paragraph" id="9202">To select specific columns from a dataset:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table package
library(data.table)

# Load the mpg dataset from the ggplot2 package
data(mpg, package = &quot;ggplot2&quot;)

# Convert the dataset to a data.table
mpg <- as.data.table(mpg)

# Select the &quot;manufacturer&quot; and &quot;model&quot; columns
mpg[, .(manufacturer, model)]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg, </span><span style="color: #FD971F">package</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;ggplot2&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Select the &quot;manufacturer&quot; and &quot;model&quot; columns</span></span>
<span class="line"><span style="color: #F8F8F2">mpg[, .(manufacturer, model)]</span></span></code></pre></div>



<p class="wp-block-paragraph" id="9768">To group and summarize a dataset:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table package
library(data.table)

# Load the mpg dataset from the ggplot2 package
data(mpg, package = &quot;ggplot2&quot;)

# Convert the dataset to a data.table
mpg <- as.data.table(mpg)

# Group the dataset by &quot;class&quot; and compute the mean of the &quot;hwy&quot; column
mpg[, .(mean_hwy = mean(hwy)), by = class]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg, </span><span style="color: #FD971F">package</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;ggplot2&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Group the dataset by &quot;class&quot; and compute the mean of the &quot;hwy&quot; column</span></span>
<span class="line"><span style="color: #F8F8F2">mpg[, .(</span><span style="color: #FD971F">mean_hwy</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">mean</span><span style="color: #F8F8F2">(hwy)), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> class]</span></span></code></pre></div>
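<p class="wp-block-paragraph">The select and group-by examples above are both instances of the general <code>DT[i, j, by]</code> form: <code>i</code> picks rows, <code>j</code> computes on columns, and <code>by</code> defines groups. The sketch below combines all three in one call; the filter on <code>year</code> and the names <code>mpg_dt</code>, <code>res</code>, and <code>n_models</code> are chosen only for illustration.</p>

```r
library(data.table)

# mpg ships with ggplot2; load it without attaching the package.
data(mpg, package = "ggplot2")
mpg_dt <- as.data.table(mpg)

# i: keep 2008 models; j: compute summaries; by: one row per class.
res <- mpg_dt[year == 2008,
              .(mean_hwy = mean(hwy), n_models = .N),
              by = class]

# Chain a second [] to sort the result by mean_hwy, highest first.
res[order(-mean_hwy)]
```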



<p class="wp-block-paragraph" id="b569">To join two datasets:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.39581298828125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table package
library(data.table)

# Load the mpg dataset from the ggplot2 package
data(mpg, package = &quot;ggplot2&quot;)

# Convert the dataset to a data.table
mpg <- as.data.table(mpg)

# Build a per-manufacturer lookup table of average cylinder counts
cylinders <- mpg[, .(mean_cyl = mean(cyl)), by = manufacturer]

# Join the mpg and cylinders tables on the &quot;manufacturer&quot; column
mpg[cylinders, on = &quot;manufacturer&quot;]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table package</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg, </span><span style="color: #FD971F">package</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;ggplot2&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Build a per-manufacturer lookup table of average cylinder counts</span></span>
<span class="line"><span style="color: #F8F8F2">cylinders </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mpg[, .(</span><span style="color: #FD971F">mean_cyl</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">mean</span><span style="color: #F8F8F2">(cyl)), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> manufacturer]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Join the mpg and cylinders tables on the &quot;manufacturer&quot; column</span></span>
<span class="line"><span style="color: #F8F8F2">mpg[cylinders, </span><span style="color: #FD971F">on</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;manufacturer&quot;</span><span style="color: #F8F8F2">]</span></span></code></pre></div>



<p class="wp-block-paragraph" id="4fbd">Perform a linear regression using the <code><strong>lm</strong></code><em> </em>function from the <code>stats</code> package and the <code><strong>data.table</strong></code> package:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table and stats packages
library(data.table)
library(stats)

# Load the mtcars dataset
data(mtcars)

# Convert the dataset to a data.table
mtcars <- setDT(mtcars)

# Perform a linear regression to predict mpg (miles per gallon) using wt (weight) as the predictor variable
fit <- mtcars[, lm(mpg ~ wt)]

# Summarize the model results
summary(fit)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table and stats packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(stats)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mtcars dataset</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mtcars)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">mtcars </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> setDT(mtcars)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Perform a linear regression to predict mpg (miles per gallon) using wt (weight) as the predictor variable</span></span>
<span class="line"><span style="color: #F8F8F2">fit </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> mtcars[, </span><span style="color: #66D9EF">lm</span><span style="color: #F8F8F2">(mpg </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> wt)]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Summarize the model results</span></span>
<span class="line"><span style="color: #66D9EF">summary</span><span style="color: #F8F8F2">(fit)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="1bd0">Create a scatterplot matrix using the <code><strong>scatterplotMatrix</strong></code> function from the <strong><code>car</code> </strong>package and the <code><strong>data.table</strong></code> package:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table and car packages
library(data.table)
library(car)

# Load the iris dataset
data(iris)

# Convert the dataset to a data.table
iris <- as.data.table(iris)

# Create a scatterplot matrix of the iris dataset
scatterplotMatrix(iris, smooth = FALSE)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table and car packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(car)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the iris dataset</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(iris)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">iris </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(iris)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a scatterplot matrix of the iris dataset</span></span>
<span class="line"><span style="color: #F8F8F2">scatterplotMatrix(iris, </span><span style="color: #FD971F">smooth</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">FALSE</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="007f">Create a faceted histogram using <strong><code>ggplot2</code> </strong>and the <code><strong>data.table</strong></code> package:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Load the data.table and ggplot2 packages
library(data.table)
library(ggplot2)

# Load the mpg dataset from the ggplot2 package
data(mpg)

# Convert the dataset to a data.table
mpg <- as.data.table(mpg)

# Create a faceted histogram showing the distribution of hwy (highway miles per gallon) by class and drv (drive type)
ggplot(mpg, aes(x = hwy)) +
  geom_histogram(binwidth = 2) +
  facet_wrap(~ class + drv, nrow = 2)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Load the data.table and ggplot2 packages</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(ggplot2)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Load the mpg dataset from the ggplot2 package</span></span>
<span class="line"><span style="color: #66D9EF">data</span><span style="color: #F8F8F2">(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Convert the dataset to a data.table</span></span>
<span class="line"><span style="color: #F8F8F2">mpg </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(mpg)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a faceted histogram showing the distribution of hwy (highway miles per gallon) by class and drv (drive type)</span></span>
<span class="line"><span style="color: #F8F8F2">ggplot(mpg, aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> hwy)) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  geom_histogram(</span><span style="color: #FD971F">binwidth</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  facet_wrap(</span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> class </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> drv, </span><span style="color: #FD971F">nrow</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span></code></pre></div>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<p class="wp-block-paragraph" id="f013">In terms of implementation, both the <strong><code>tidyverse</code> </strong>and <code><strong>data.table</strong></code> are R packages, but much of the core of <code><strong>data.table</strong></code> is implemented in C for improved performance. Several <strong><code>tidyverse</code> </strong>packages, such as <code>dplyr</code>, also rely on compiled code.</p>
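<p class="wp-block-paragraph">To get a rough feel for what the compiled code buys you, the sketch below times a grouped mean with <code>data.table</code> against base R's <code>tapply()</code>. The table size, the column names <code>g</code> and <code>x</code>, and the choice of <code>tapply()</code> as the baseline are illustrative assumptions, and the timings will vary by machine and R version, so treat it as a quick experiment and not a benchmark.</p>

<pre class="wp-block-code"><code># Load the data.table package
library(data.table)

# Build an illustrative table: one million rows in 26 groups (assumed sizes)
set.seed(1)
n <- 1e6
dt <- data.table(g = sample(letters, n, replace = TRUE), x = runif(n))

# Grouped mean with data.table
t_dt <- system.time(res_dt <- dt[, .(mean_x = mean(x)), by = g])

# The same grouped mean with base R's tapply()
t_base <- system.time(res_base <- tapply(dt$x, dt$g, mean))

# Both approaches should agree on the group means
stopifnot(isTRUE(all.equal(res_dt[order(g), mean_x],
                           as.vector(res_base[order(names(res_base))]))))

# Compare elapsed times in seconds
print(c(data.table = t_dt[["elapsed"]], base = t_base[["elapsed"]]))</code></pre>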



<h2 class="wp-block-heading">In summary</h2>



<p class="wp-block-paragraph" id="4b51">The <code><strong>tidyverse</strong></code> and <code><strong>data.table</strong></code> are two popular tools in R for working with data. The <strong><code>tidyverse</code> </strong>is a collection of packages designed for data manipulation, visualization, and modeling, and it is particularly suitable for tasks that call for simplicity and ease of use. Its functions are easy to learn and often require fewer lines of code than the alternatives.</p>



<p class="wp-block-paragraph" id="9f5e">The <code><strong>data.table</strong></code> package is a high-performance package that is particularly useful when you work with large datasets or need to perform complex operations on them. Its functions are generally faster than their counterparts in the <strong><code>tidyverse</code></strong>, especially on large datasets.</p>



<p class="wp-block-paragraph" id="612d">In general, it is a good idea to use the <strong><code>tidyverse</code> </strong>for most tasks, unless you are working with very large datasets or need the extra performance provided by the <code><strong>data.table</strong></code> package.</p>



<h4 class="wp-block-heading">At Analytica</h4>



<p class="wp-block-paragraph" id="4600">Since we deal with larger datasets, from gigabytes to terabytes of data, our preferred tool for data wrangling in R is in fact <code><strong>data.table</strong></code>.</p>



<p class="wp-block-paragraph" id="9cc4">I hope this article helps the reader understand the differences between the <strong><code>tidyverse</code> </strong>and <code><strong>data.table</strong></code> in R, and how to choose the right package for their tasks. Let me know if you have any questions.</p>



<p class="wp-block-paragraph">Read more posts on the AnalyticaDSS blog: <a href="https://analyticadss.com/blog">Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on Medium: <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on R-bloggers: <a href="https://www.r-bloggers.com/">https://www.r-bloggers.com</a></p>
<p>The post <a href="https://analyticadss.com/the-tidyverse-and-data-table-r-packages/">The Tidyverse and data.table R Packages</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Maximizing Efficiency with Loops and Vectorization in Programming Languages</title>
		<link>https://analyticadss.com/maximizing-efficiency-with-loops-and-vectorization-in-programming-languages/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Thu, 24 Dec 2020 09:50:02 +0000</pubDate>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Programming Languages]]></category>
		<category><![CDATA[R Programming Language]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4890</guid>

					<description><![CDATA[<p>” Maximizing Efficiency with Loops and Vectorization in Programming “ Table of Content: I. Introduction to loops and vectorization in programming languages II. Loops in programming languages III. Vectorization in programming languages IV. When to use loops vs. vectorization V. Best practices for using loops and vectorization VI. Conclusion I. Introduction to loops and vectorization [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/maximizing-efficiency-with-loops-and-vectorization-in-programming-languages/">Maximizing Efficiency with Loops and Vectorization in Programming Languages</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph">“<strong>Maximizing Efficiency with Loops and Vectorization in Programming</strong>”</p>



<h2 class="wp-block-heading" id="ab2f">Table of Contents:</h2>



<p class="wp-block-paragraph" id="7cfd"><strong>I. Introduction to loops and vectorization in programming languages</strong></p>



<ul class="wp-block-list">
<li><strong>Definition of loops</strong></li>



<li><strong>Types of loops (for, while, repeat)</strong></li>



<li><strong>Definition of vectorization</strong></li>



<li><strong>Advantages of vectorization over loops</strong></li>
</ul>



<p class="wp-block-paragraph" id="af9e"><strong>II. Loops in programming languages</strong></p>



<ul class="wp-block-list">
<li><strong>How loops work</strong></li>



<li><strong>Examples of loop usage</strong></li>



<li><strong>Common pitfalls of using loops</strong></li>
</ul>



<p class="wp-block-paragraph" id="d9a0"><strong>III. Vectorization in programming languages</strong></p>



<ul class="wp-block-list">
<li><strong>How vectorization works</strong></li>



<li><strong>Examples of vectorized operations</strong></li>



<li><strong>Advantages of vectorization (speed, efficiency)</strong></li>
</ul>



<p class="wp-block-paragraph" id="9387"><strong>IV. When to use loops vs. vectorization</strong></p>



<ul class="wp-block-list">
<li><strong>Situations where loops are necessary</strong></li>



<li><strong>Situations where vectorization is preferred</strong></li>



<li><strong>Trade-offs between loops and vectorization</strong></li>
</ul>



<p class="wp-block-paragraph" id="8e71"><strong>V. Best practices for using loops and vectorization</strong></p>



<ul class="wp-block-list">
<li><strong>Tips for optimizing loop performance</strong></li>



<li><strong>Tips for choosing between loops and vectorization</strong></li>
</ul>



<p class="wp-block-paragraph" id="dabe"><strong>VI. Conclusion</strong></p>



<ul class="wp-block-list">
<li><strong>Summary of key points</strong></li>



<li><strong>Importance of understanding loops and vectorization in programming languages</strong></li>
</ul>



<h2 class="wp-block-heading" id="49de">I. Introduction to loops and vectorization in programming languages</h2>



<p class="wp-block-paragraph" id="b26a">Loops and vectorization are two important concepts in programming languages. They are two different ways of applying the same operation to many values, and both are used to manipulate data and perform calculations. Understanding how to use each of them effectively can have a significant impact on the efficiency and performance of your code.</p>



<h3 class="wp-block-heading" id="d2ca">Definition of loops</h3>



<p class="wp-block-paragraph" id="613c">A loop is a way to repeat a set of instructions multiple times. Most programming languages offer several kinds of loop. R, whose syntax is used in the examples below, provides <code>for</code> loops, <code>while</code> loops, and <code>repeat</code> loops.</p>



<p class="wp-block-paragraph" id="9bb0">A <code>for</code> loop is used to iterate over a sequence of objects, such as a list or an array. It has the following syntax:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="for (variable in sequence) {
  statements
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (variable </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> sequence) {</span></span>
<span class="line"><span style="color: #F8F8F2">  statements</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph">A <code>while</code> loop, on the other hand, continues to execute as long as a certain condition is true. It has the following syntax:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="while (condition) {
  statements
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F92672">while</span><span style="color: #F8F8F2"> (condition) {</span></span>
<span class="line"><span style="color: #F8F8F2">  statements</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph">Finally, a <code>repeat</code> loop is similar to a <code>while</code> loop, except that it has no condition of its own: the body always executes at least once and keeps running until a <code>break</code> statement is reached. In R, it has the following syntax:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="repeat {
  statements
  if (condition) break
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F92672">repeat</span><span style="color: #F8F8F2"> {</span></span>
<span class="line"><span style="color: #F8F8F2">  statements</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> (condition) </span><span style="color: #F92672">break</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>
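<p class="wp-block-paragraph">As a concrete illustration of the three loop types, here is a minimal sketch in R that sums the same vector three times, once with each kind of loop. The variable names (<code>values</code>, <code>total_for</code>, and so on) are made up for this example. Note that in R a <code>repeat</code> loop only ends when <code>break</code> is called.</p>

<pre class="wp-block-code"><code># Illustrative input vector
values <- c(1, 2, 3, 4, 5)

# for loop: one iteration per element of the sequence
total_for <- 0
for (v in values) {
  total_for <- total_for + v
}

# while loop: the condition is checked before every iteration
total_while <- 0
i <- 1
while (i <= length(values)) {
  total_while <- total_while + values[i]
  i <- i + 1
}

# repeat loop: the body runs at least once and stops at break
total_repeat <- 0
j <- 1
repeat {
  total_repeat <- total_repeat + values[j]
  j <- j + 1
  if (j > length(values)) break
}

# All three loops compute the same sum
print(c(total_for, total_while, total_repeat))
# [1] 15 15 15</code></pre>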



<h3 class="wp-block-heading" id="c1ff">Definition of vectorization</h3>



<p class="wp-block-paragraph" id="64c7">Vectorization applies an operation to every element of a vector in a single call, rather than using an explicit loop to visit each element individually. Vectorized operations are generally faster and more concise than the equivalent loop, because the iteration happens inside optimized, compiled routines (in R, the C code behind base functions and operators) instead of in interpreted code.</p>



<p class="wp-block-paragraph" id="d631">Here is an example of vectorization in the R statistical programming language:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704856872558594px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Create a vector of numbers
numbers <- c(1, 2, 3, 4, 5)

# Add 1 to each element of the vector using vectorization
numbers <- numbers + 1

# Print the resulting vector
print(numbers)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Create a vector of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Add 1 to each element of the vector using vectorization</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> numbers </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Print the resulting vector</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(numbers)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="ea9a">The output of this code will be:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="[1] 2 3 4 5 6" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">[</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">] </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">6</span></span></code></pre></div>



<p class="wp-block-paragraph" id="ec02">In this example, a single <code>+ 1</code> expression updates every element of <code>numbers</code> at once, with no explicit loop in the code. This is vectorization: R still does the element-wise work, but it does so internally in optimized, compiled routines rather than in interpreted R code.</p>



<h2 class="wp-block-heading" id="c3e7">Advantages of vectorization over loops</h2>



<p class="wp-block-paragraph" id="a013">Vectorization has several advantages over loops. First, vectorized operations are generally faster than looping, because they take advantage of optimized routines and the underlying structure of vectors. Second, vectorization is often easier to read and understand than looping, because it uses concise and expressive syntax. Finally, vectorization can improve the maintainability of your code, because it is easier to modify and debug than looping.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="65a0">II. Loops in programming languages</h2>



<p class="wp-block-paragraph" id="18c9">Loops are a fundamental concept in programming languages and are used to repeat a set of instructions multiple times. In this section, we will explore how loops work, provide examples of their usage, and discuss the common pitfalls of using loops.</p>



<h3 class="wp-block-heading" id="33ef">How loops work</h3>



<p class="wp-block-paragraph" id="3d48">Loops work by iterating over a sequence of objects and executing a set of instructions for each iteration. The number of iterations is determined by the length of the sequence or by a specified condition.</p>



<p class="wp-block-paragraph" id="a3d3">In most programming languages, loops are controlled by a looping construct, such as a <code>for</code> loop or a <code>while</code> loop. The looping construct specifies the sequence to be iterated over and the statements to be executed for each iteration.</p>



<h3 class="wp-block-heading" id="df6a">Examples of loop usage</h3>



<p class="wp-block-paragraph" id="9517">Here are some examples of loop usage in different programming languages:</p>



<p class="wp-block-paragraph" id="db03"><strong>R:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Create a vector of numbers
numbers <- c(1, 2, 3, 4, 5)

# Use a for loop to iterate over the vector and print each number
for (i in numbers) {
  print(i)
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Create a vector of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Use a for loop to iterate over the vector and print each number</span></span>
<span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> numbers) {</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(i)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph" id="b749"><strong>Python:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Create a list of numbers
numbers = [1, 2, 3, 4, 5]

# Use a for loop to iterate over the list and print each number
for i in numbers:
  print(i)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Create a list of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> [</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Use a for loop to iterate over the list and print each number</span></span>
<span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> numbers:</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(i)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="ec46"><strong>Java:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="// Create an array of numbers
int[] numbers = {1, 2, 3, 4, 5};

// Use a for loop to iterate over the array and print each number
for (int i = 0; i < numbers.length; i++) {
  System.out.println(numbers[i]);
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F">// Create an array of numbers</span></span>
<span class="line"><span style="color: #66D9EF">int</span><span style="color: #F8F8F2">[] numbers </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> {</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">};</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F">// Use a for loop to iterate over the array and print each number</span></span>
<span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (</span><span style="color: #66D9EF">int</span><span style="color: #F8F8F2"> i </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">; i </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> numbers.length; i</span><span style="color: #F92672">++</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">  System.out.</span><span style="color: #A6E22E">println</span><span style="color: #F8F8F2">(numbers[i]);</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<h2 class="wp-block-heading" id="f05b">Common pitfalls of using loops</h2>



<p class="wp-block-paragraph" id="7ab4">There are several common pitfalls to be aware of when using loops. One pitfall is the risk of infinite loops, which occur when the looping condition is always true or the loop counter is not updated properly. Infinite loops can cause your program to run indefinitely and can be difficult to debug.</p>
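
<p class="wp-block-paragraph">One common defence against infinite loops is an iteration cap. The snippet below is a minimal sketch in R; the variable names (<code>tolerance</code>, <code>max_iter</code>, <code>iter</code>) and the halving step are made up for illustration and are not part of the original example.</p>

<pre class="wp-block-code"><code># Hypothetical example: halve x until it drops below a tolerance,
# with an iteration cap as a safety net
x &lt;- 100
tolerance &lt;- 1
max_iter &lt;- 1000
iter &lt;- 0

while (x &gt;= tolerance &amp;&amp; iter &lt; max_iter) {
  x &lt;- x / 2          # this update is what lets the condition become FALSE
  iter &lt;- iter + 1    # counting iterations keeps the cap effective
}

print(iter)
print(x)</code></pre>

<p class="wp-block-paragraph">If the line that updates <code>x</code> were removed, the condition <code>x &gt;= tolerance</code> would stay true forever, and only the cap would stop the loop.</p>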



<p class="wp-block-paragraph" id="42b4">Another pitfall is the risk of off-by-one errors, which occur when the loop counter is not properly initialized or the loop condition is not properly defined. Off-by-one errors can cause your loop to either iterate too few or too many times, resulting in incorrect output or unintended behavior.</p>
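
<p class="wp-block-paragraph">In R, a frequent source of this kind of error is writing <code>1:length(x)</code> for the loop range. The sketch below uses a hypothetical helper, <code>count_iterations()</code>, to count how many times each style of loop runs; it is only meant to illustrate the edge case.</p>

<pre class="wp-block-code"><code># Hypothetical helper: counts loop iterations for two ways of indexing x
count_iterations &lt;- function(x) {
  n_colon &lt;- 0
  for (i in 1:length(x)) {      # 1:0 expands to c(1, 0) when x is empty
    n_colon &lt;- n_colon + 1
  }

  n_seq &lt;- 0
  for (i in seq_along(x)) {     # seq_along() gives an empty sequence instead
    n_seq &lt;- n_seq + 1
  }

  c(colon = n_colon, seq_along = n_seq)
}

count_iterations(c(10, 20, 30))   # both loops run 3 times
count_iterations(numeric(0))      # 1:length(x) runs twice, seq_along(x) never</code></pre>

<p class="wp-block-paragraph">For an empty vector the <code>1:length(x)</code> loop runs twice, with <code>i</code> equal to 1 and then 0, which is why <code>seq_along()</code> and <code>seq_len()</code> are usually the safer choice.</p>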



<p class="wp-block-paragraph" id="82a7">Finally, loops can be slower and less efficient than vectorized operations, particularly for large datasets. This can be a problem if performance is critical for your application.</p>



<p class="wp-block-paragraph" id="4821">In general, it is important to carefully consider the performance and readability of your code when using loops and to choose the appropriate looping construct for your specific needs.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="8fc8">III. Vectorization in programming languages</h2>



<p class="wp-block-paragraph" id="c599">Vectorization performs an operation on all elements of a vector in a single call, rather than using an explicit loop to visit each element. Vectorized operations are generally faster than equivalent loops because the element-by-element work runs in optimized, compiled routines inside the language or library rather than in interpreted code. In this section, we will explore how vectorization works, provide examples of vectorized operations, and discuss the advantages of vectorization.</p>



<h2 class="wp-block-heading" id="f512">How vectorization works</h2>



<p class="wp-block-paragraph" id="f421">Vectorization works by applying an operation to an entire vector in one call, rather than writing code that handles each element individually. Many languages and libraries support this style: R’s arithmetic operators and most of its math functions work element-wise on vectors, as do NumPy arrays in Python (plain Python lists do not). R’s <code>apply</code> family of functions offers similarly compact syntax, although those functions still call an R function once per element.</p>



<p class="wp-block-paragraph" id="5319">Vectorization is typically faster than looping because the iteration still happens, but inside compiled code, so the interpreter does not have to evaluate the loop body once per element. It also often results in more readable and expressive code.</p>
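
<p class="wp-block-paragraph">To see the difference on your own machine, you could time both approaches. The following is a rough sketch rather than a rigorous benchmark; the function names <code>add_one_loop()</code> and <code>add_one_vec()</code> and the input size are assumptions chosen for illustration, and the timings will vary with hardware and R version.</p>

<pre class="wp-block-code"><code>set.seed(42)          # arbitrary seed so the sketch is repeatable
x &lt;- runif(1e6)       # one million random numbers

# Loop version: pre-allocate the result, then fill it element by element
add_one_loop &lt;- function(x) {
  out &lt;- numeric(length(x))
  for (i in seq_along(x)) {
    out[i] &lt;- x[i] + 1
  }
  out
}

# Vectorized version: a single call that loops in compiled code
add_one_vec &lt;- function(x) {
  x + 1
}

# Both versions should produce the same result
all.equal(add_one_loop(x), add_one_vec(x))

# Compare elapsed time (no specific numbers are claimed here)
system.time(add_one_loop(x))
system.time(add_one_vec(x))</code></pre>

<p class="wp-block-paragraph">For more careful measurements, a dedicated benchmarking package that repeats each expression many times is usually a better tool than a single <code>system.time()</code> call.</p>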



<h2 class="wp-block-heading" id="71a2">Examples of vectorized operations</h2>



<p class="wp-block-paragraph" id="861c">Here are some examples of vectorized operations in different programming languages:</p>



<p class="wp-block-paragraph" id="09df"><strong>R:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704856872558594px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Create a vector of numbers
numbers <- c(1, 2, 3, 4, 5)

# Add 1 to each element of the vector using vectorization
numbers <- numbers + 1

# Print the resulting vector
print(numbers)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Create a vector of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Add 1 to each element of the vector using vectorization</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> numbers </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Print the resulting vector</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(numbers)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="5e13"><strong>Python:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395828247070312px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Import the NumPy library
import numpy as np

# Create a NumPy array of numbers
numbers = np.array([1, 2, 3, 4, 5])

# Add 1 to each element of the array using vectorization
numbers = numbers + 1

# Print the resulting array
print(numbers)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Import the NumPy library</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> numpy </span><span style="color: #F92672">as</span><span style="color: #F8F8F2"> np</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Create a NumPy array of numbers</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.array([</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">])</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Add 1 to each element of the array using vectorization</span></span>
<span class="line"><span style="color: #F8F8F2">numbers </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> numbers </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Print the resulting array</span></span>
<span class="line"><span style="color: #66D9EF">print</span><span style="color: #F8F8F2">(numbers)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="8ddf"><strong>Java:</strong></p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.39581298828125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="// Import the Apache Commons Math library
import org.apache.commons.math3.util.FastMath;
// Import Arrays from the Java standard library
import java.util.Arrays;

// Create a Java array of numbers
double[] numbers = {1, 2, 3, 4, 5};

// Use DoubleStream.map from the standard Streams API to square every element (FastMath.pow comes from Apache Commons Math)
double[] squared = Arrays.stream(numbers).map(x -> FastMath.pow(x, 2)).toArray();

// Print the resulting array
System.out.println(Arrays.toString(squared));" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F">// Import the Apache Commons Math library</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">org.apache.commons.math3.util.FastMath</span><span style="color: #F8F8F2">;</span></span>
<span class="line"><span style="color: #88846F">// Import Arrays from the Java standard library</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">java.util.Arrays</span><span style="color: #F8F8F2">;</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F">// Create a Java array of numbers</span></span>
<span class="line"><span style="color: #66D9EF">double</span><span style="color: #F8F8F2">[] numbers </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> {</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">5</span><span style="color: #F8F8F2">};</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F">// Use DoubleStream.map from the standard Streams API to square every element (FastMath.pow comes from Apache Commons Math)</span></span>
<span class="line"><span style="color: #66D9EF">double</span><span style="color: #F8F8F2">[] squared </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Arrays.</span><span style="color: #A6E22E">stream</span><span style="color: #F8F8F2">(numbers).</span><span style="color: #A6E22E">map</span><span style="color: #F8F8F2">(x </span><span style="color: #66D9EF">-></span><span style="color: #F8F8F2"> FastMath.</span><span style="color: #A6E22E">pow</span><span style="color: #F8F8F2">(x, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)).</span><span style="color: #A6E22E">toArray</span><span style="color: #F8F8F2">();</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F">// Print the resulting array</span></span>
<span class="line"><span style="color: #F8F8F2">System.out.</span><span style="color: #A6E22E">println</span><span style="color: #F8F8F2">(Arrays.</span><span style="color: #A6E22E">toString</span><span style="color: #F8F8F2">(squared));</span></span></code></pre></div>



<h2 class="wp-block-heading" id="27aa">Advantages of vectorization</h2>



<p class="wp-block-paragraph" id="f63b">Vectorization has several advantages over looping. First, vectorized operations are generally faster, because the per-element work runs in optimized, compiled routines rather than in interpreted code. Second, vectorized code is often easier to read, because one expression replaces the loop, the index variable, and the bookkeeping for the result. Finally, shorter code tends to be easier to maintain, because there are fewer places for indexing mistakes to hide.</p>



<p class="wp-block-paragraph" id="3650">However, it’s worth noting that vectorization is not always possible or appropriate. In some cases, you may need to use a loop to perform an operation that is not vectorizable, or to perform an operation that depends on the previous iteration. In these cases, looping may be necessary or more appropriate.</p>
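
<p class="wp-block-paragraph">As an illustration of an operation that depends on the previous iteration, here is a small sketch in R. The function <code>logistic_map()</code> and its arguments are hypothetical and chosen only for this example; it assumes <code>n</code> is at least 1.</p>

<pre class="wp-block-code"><code># Hypothetical helper: iterate x[i + 1] = r * x[i] * (1 - x[i])
# Assumes n is at least 1
logistic_map &lt;- function(x0, r, n) {
  x &lt;- numeric(n)       # pre-allocate the result
  x[1] &lt;- x0
  for (i in seq_len(n - 1)) {
    # Each value is computed from the one produced in the previous step
    x[i + 1] &lt;- r * x[i] * (1 - x[i])
  }
  x
}

logistic_map(x0 = 0.5, r = 3, n = 4)</code></pre>

<p class="wp-block-paragraph">Because each element needs the value before it, there is no single element-wise expression that produces the whole vector at once, so a loop is a reasonable choice here.</p>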



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="e9e4">IV. When to use loops vs. vectorization</h2>



<p class="wp-block-paragraph" id="bd5f">Loops and vectorization are two different approaches to performing the same task in programming. In general, vectorization is preferred because it is faster and more efficient than looping, but there are situations where loops may be necessary or more appropriate. In this section, we will explore when to use loops vs. vectorization.</p>



<h2 class="wp-block-heading" id="7048">Situations where loops are necessary</h2>



<p class="wp-block-paragraph" id="2445">There are several situations where loops may be necessary or more appropriate than vectorization. One such situation is when the task is inherently sequential and cannot be expressed as an element-wise operation, such as processing a file line by line as it is read or interacting with a user through the console. In these cases, a loop is usually the most natural way to achieve the desired behavior.</p>



<p class="wp-block-paragraph" id="2366">Another situation where loops may be necessary is when you need to perform an operation that depends on the previous iteration. For example, consider the following code, which uses a loop to calculate the factorial of a number:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="factorial <- function(n) {
  result <- 1
  # seq_len(n) is empty when n is 0, so factorial(0) correctly returns 1
  for (i in seq_len(n)) {
    result <- result * i
  }
  return(result)
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #A6E22E">factorial</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(n) {</span></span>
<span class="line"><span style="color: #F8F8F2">  result </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #88846F">  # seq_len(n) is empty when n is 0, so factorial(0) correctly returns 1</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">seq_len</span><span style="color: #F8F8F2">(n)) {</span></span>
<span class="line"><span style="color: #F8F8F2">    result </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> result </span><span style="color: #F92672">*</span><span style="color: #F8F8F2"> i</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">return</span><span style="color: #F8F8F2">(result)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



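<p class="wp-block-paragraph" id="307a">In this case, each iteration multiplies the running result by the current number, so every step depends on the one before it. Computations with this kind of dependency are the typical case for a loop. Factorial itself is simple enough that R’s built-in <code>prod()</code> and <code>factorial()</code> functions can compute it without an explicit loop, but many recurrences have no such ready-made function.</p>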
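
<p class="wp-block-paragraph">For comparison, the sketch below places a loop next to R’s built-in alternatives. The name <code>factorial_loop()</code> is an assumption made here so that the example does not mask base R’s own <code>factorial()</code>.</p>

<pre class="wp-block-code"><code># Loop version, renamed so it does not mask base::factorial()
factorial_loop &lt;- function(n) {
  result &lt;- 1
  for (i in seq_len(n)) {
    result &lt;- result * i
  }
  result
}

n &lt;- 5
factorial_loop(n)      # explicit loop: 120
prod(seq_len(n))       # one vectorized call: 120
factorial(n)           # built-in function: 120
cumprod(seq_len(n))    # every running product: 1 2 6 24 120</code></pre>

<p class="wp-block-paragraph">When a built-in reduction such as <code>prod()</code> or <code>cumprod()</code> matches the computation, it is usually the simpler option; the loop remains useful when the update rule is more involved.</p>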



<h2 class="wp-block-heading" id="db38">Situations where vectorization is preferred</h2>



<p class="wp-block-paragraph" id="3f9b">In general, vectorization is preferred over looping because it is faster and more efficient. This is particularly true for large datasets, where the overhead of calling a looping construct and iterating over each element individually can significantly impact performance.</p>



<p class="wp-block-paragraph" id="8c1d">Vectorization is also often easier to read and understand than looping because it uses concise and expressive syntax. This can improve the maintainability of your code because it is easier to modify and debug than looping.</p>



<h2 class="wp-block-heading" id="91ff">Trade-offs between loops and vectorization</h2>



<p class="wp-block-paragraph" id="c49f">There are trade-offs to consider when choosing between loops and vectorization. Loops may be slower and less efficient than vectorized operations, particularly for large datasets. However, loops can be more flexible and easier to modify than vectorized operations, particularly when the operation depends on the previous iteration.</p>



<p class="wp-block-paragraph" id="b450">In general, it is important to carefully consider the performance and readability of your code when choosing between loops and vectorization and to choose the appropriate approach for your specific needs.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="7831">V. Best practices for using loops and vectorization</h2>



<p class="wp-block-paragraph" id="2a38">Using loops and vectorization effectively can have a significant impact on the efficiency and performance of your code. In this section, we will discuss some best practices for using loops and vectorization in programming languages.</p>



<h2 class="wp-block-heading" id="64be">Tips for optimizing loop performance</h2>



<p class="wp-block-paragraph" id="ba66">There are several ways to optimize the performance of loops:</p>



<ol class="wp-block-list">
<li>Use the appropriate looping construct: Choose the looping construct that is most appropriate for your specific needs. For example, use a <code>for</code> loop to iterate over a sequence of objects, use a <code>while</code> loop to continue executing as long as a certain condition is true, or use a <code>repeat</code> loop when the body must run at least once and the exit condition is checked inside the loop with <code>break</code>.</li>



<li>Avoid unnecessary calculations: Only perform calculations that are necessary for the current iteration. Avoid performing unnecessary calculations or creating unnecessary variables, as this can slow down your loop.</li>



<li>Pre-allocate memory: If you are creating an object within the loop, such as a list or an array, pre-allocate memory for it before the loop starts. This can improve the performance of your loop by avoiding the overhead of repeatedly reallocating memory.</li>



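<li>Use optimized functions: Use optimized functions and libraries, such as R's built-in vectorized functions or the NumPy library in Python, to perform common operations. They run in compiled code and are generally faster than an explicit loop. The <code>apply</code> family of functions in R is a concise alternative to a loop, but it still calls an R function once per element, so its speed is often similar to that of a well-written <code>for</code> loop.</li>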
</ol>
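
<p class="wp-block-paragraph">The pre-allocation tip above can be sketched as follows. All three versions compute the same square roots; the vector length and variable names are arbitrary choices for the illustration.</p>

<pre class="shiki" style="background-color: #272822; color: #F8F8F2"><code>n <- 1000

# Growing the result: the vector is extended (and copied) on every iteration
grown <- c()
for (i in seq_len(n)) {
  grown <- c(grown, sqrt(i))
}

# Pre-allocated: the result vector is created once and filled in place
filled <- numeric(n)
for (i in seq_len(n)) {
  filled[i] <- sqrt(i)
}

# Vectorized: no explicit loop at all
vectorized <- sqrt(seq_len(n))
</code></pre>

<p class="wp-block-paragraph">Wrapping each version in <code>system.time()</code> with a larger <code>n</code> is one way to see the difference on your own machine.</p>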



<h2 class="wp-block-heading" id="e19d">Tips for choosing between loops and vectorization</h2>



<p class="wp-block-paragraph" id="e6a7"><strong>When choosing between loops and vectorization, consider the following factors:</strong></p>



<ol class="wp-block-list">
<li>Performance: Vectorization is generally faster and more efficient than looping, particularly for large datasets. However, there are situations where loops may be faster, such as when the vector is very small.</li>



<li>Readability: Vectorization is often easier to read and understand than looping because it uses concise and expressive syntax. This can improve the maintainability of your code because it is easier to modify and debug than looping.</li>



<li>Flexibility: Loops can be more flexible than vectorized operations, particularly when the operation depends on the previous iteration. However, vectorization can be more flexible in some cases, because it allows you to perform operations on multiple elements simultaneously.</li>
</ol>
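
<p class="wp-block-paragraph">Conditional logic is a common case where both approaches work. The sketch below labels some made-up scores with a loop and with the vectorized <code>ifelse()</code>; the threshold and the variable names are assumptions for the example.</p>

<pre class="shiki" style="background-color: #272822; color: #F8F8F2"><code>scores <- c(45, 82, 67, 90)

# Loop version: test one element at a time
labels_loop <- character(length(scores))
for (i in seq_along(scores)) {
  labels_loop[i] <- if (scores[i] >= 70) "pass" else "fail"
}

# Vectorized version: the condition is evaluated for all elements at once
labels_vec <- ifelse(scores >= 70, "pass", "fail")

identical(labels_loop, labels_vec)  # TRUE
</code></pre>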






<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading" id="afee">VI. Conclusion</h2>



<p class="wp-block-paragraph" id="71ba">In conclusion, loops and vectorization are two important concepts in programming languages that refer to different ways of performing the same task. <strong>Loops are used to iterate over a sequence of objects and execute a set of instructions for each iteration, while vectorization is a way to perform operations on multiple elements of a vector simultaneously.</strong></p>



<p class="wp-block-paragraph" id="9bfe">Vectorization is generally preferred over looping because it is faster and more efficient, and because it often results in more readable and expressive code. However, there are situations where loops may be necessary or more appropriate, such as when the operation is not vectorizable or depends on the previous iteration.</p>



<p class="wp-block-paragraph" id="1abe">It is important to understand when to use loops and when to use vectorization, and to choose the appropriate approach for your specific needs. Some best practices for using loops and vectorization include optimizing loop performance, choosing the appropriate looping construct, and using optimized functions and libraries.</p>



<p class="wp-block-paragraph" id="7c03">Overall, understanding loops and vectorization is crucial for writing efficient and effective code in programming languages.</p>






<p class="wp-block-paragraph">Read more posts on the AnalyticaDSS blog: <a href="https://analyticadss.com/blog">Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on Medium: <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on R-bloggers: <a href="https://www.r-bloggers.com/">https://www.r-bloggers.com</a></p>
<p>The post <a href="https://analyticadss.com/maximizing-efficiency-with-loops-and-vectorization-in-programming-languages/">Maximizing Efficiency with Loops and Vectorization in Programming Languages</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Analyzing Crypto Market using R — Part 2</title>
		<link>https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-2/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Mon, 24 Dec 2018 10:34:58 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Bitcoin]]></category>
		<category><![CDATA[Cryptocurrency]]></category>
		<category><![CDATA[Data Analysis]]></category>
		<category><![CDATA[R]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4907</guid>

					<description><![CDATA[<p>Correlations in the Crypto World. Analyzing the crypto market. Aous Abdo, WWW.ANALYTICADSS.COM. An interactive version of this post can be found here. In my previous post I explored Bitcoin data from different exchanges and covered some arbitrage-related data. In part 2 of this series I will explore altcoin data. R Libraries: Below is a list [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-2/">Analyzing Crypto Market using R — Part 2</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading" id="bf86">Correlations in the Crypto World</h2>



<p class="wp-block-paragraph">Analyzing the crypto market</p>



<p class="wp-block-paragraph"><a href="https://medium.com/u/4f20dbfad286?source=post_page-----b1a0aa44006e--------------------------------" rel="noreferrer noopener" target="_blank">Aous Abdo</a>, <a href="http://www.analyticadss.com/" rel="noreferrer noopener" target="_blank">WWW.ANALYTICADSS.COM</a><br>An interactive version of this post can be found <a href="https://analyticadss.com/adss_blog/crypto_notebook_part2.nb.html" rel="noreferrer noopener" target="_blank">here</a>.</p>



<p class="wp-block-paragraph" id="ae2d">In my previous post I explored Bitcoin data from different exchanges and covered some arbitrage-related data. In part 2 of this series I will explore altcoin data.</p>



<h2 class="wp-block-heading" id="3a8a">R Libraries</h2>



<p class="wp-block-paragraph" id="46c5">Below is a list of the R libraries we will use in this analysis. Not all of them are strictly necessary, but they will all make our lives easier.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="library(PoloniexR)
library(data.table)
library(lubridate)
library(Quandl)
library(plyr)
library(stringr)
library(ggplot2)
library(plotly)
library(janitor)
library(quantmod)
library(pryr)
library(corrplot)
library(PerformanceAnalytics)
library(tidyr)
library(MLmetrics)
library(tidyquant)
library(corrr)
library(cowplot)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(PoloniexR)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(lubridate)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(Quandl)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(plyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(stringr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(ggplot2)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(plotly)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(janitor)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(quantmod)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(pryr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(corrplot)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(PerformanceAnalytics)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(MLmetrics)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyquant)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(corrr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(cowplot)</span></span></code></pre></div>
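
<p class="wp-block-paragraph">If some of these packages are not installed yet, a small check like the one below can list what is missing before you call <code>library()</code>. This is only a sketch: <code>missing_packages</code> is a hypothetical helper written for this example, and the <code>needed</code> vector shows just a few of the packages above.</p>

<pre class="shiki" style="background-color: #272822; color: #F8F8F2"><code># Hypothetical helper: which of the requested packages are not installed?
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed     <- c("data.table", "lubridate", "ggplot2")
to_install <- missing_packages(needed)

# Uncomment to install whatever is missing from your configured repository:
# if (length(to_install) > 0) install.packages(to_install)
</code></pre>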



<h2 class="wp-block-heading" id="7a90">Data</h2>



<p class="wp-block-paragraph" id="f5c0">The best source I know of for altcoin data is the <a href="https://cran.r-project.org/web/packages/PoloniexR/index.html" rel="noreferrer noopener" target="_blank">PoloniexR</a> package. I have written an R function to help download the data.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:23.104170322418213px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="get_alt_data <- function(tz = &quot;UTC&quot;
                         , coin = c(&quot;ETH&quot;, &quot;LTC&quot;)
                         , add_bitcoin = TRUE
                         , return_in_USDT = TRUE
                         , from = &quot;2017-01-01&quot;
                         , to = &quot;2018-04-09&quot;
                         , period = &quot;D&quot;
                         , verbose = FALSE){
  
  # We will be using the public API
  poloniex.public <- PoloniexPublicAPI()
  
  # set the session time zone (the environment variable name is TZ, in upper case)
  Sys.setenv(TZ = tz)
  
  # convert from and to into time obj
  from  <- as.POSIXct(paste(from, tz, sep = &quot;&quot;))
  to    <- as.POSIXct(paste(to, tz, sep = &quot;&quot;))
  
  # lists to store data.tables and xts objects
  chart_list <- list()
  dt_list    <- list()
  
  # make sure the coin pair is in upper case
  coin       <- toupper(coin)
  coin_pairs <- paste0(&quot;BTC_&quot;, coin[coin != &quot;BTC&quot;])
  if(add_bitcoin | return_in_USDT) coin_pairs <- c(&quot;USDT_BTC&quot;, coin_pairs)
  
  # loop over the coins to get the data
  for(i in coin_pairs){
    if(verbose)
      invisible(cat('\tGetting data for ', i, ' pair\n'))
    
    # this is a list that will contain the chart data for each coin pair
    try(chart_list[[i]] <- ReturnChartData(theObject = poloniex.public
                                       , pair      = i
                                       , from      = from
                                       , to        = to
                                       , period    = period)
        , silent = TRUE)
    
    # list to contain data.tables 
    try(dt_list[[i]] <- as.data.table(chart_list[[i]]), silent = TRUE)
  }
  
  # convert to data.table and make sure to add a column containing the pairs
  coin_dt <- rbindlist(l = dt_list, use.names = TRUE, idcol = &quot;pair&quot;)
  
  # return data in usdt prices
  if(return_in_USDT){
    # to get the price of the alt coin in usdt is not that simple but we'll do it
    # get a DT of the btc_usdt pair
    btc_usd <- coin_dt[pair == &quot;USDT_BTC&quot;]
    btc_usd <- btc_usd[, .(index, pair, weightedaverage)]
    setnames(btc_usd, c(&quot;Date&quot;, &quot;USDT_BTC_pair&quot;, &quot;USDT_BTC_price&quot;))
    
    # get DT with only alt coins
    alt_coins <- copy(coin_dt)#[pair != &quot;USDT_BTC&quot;]
    
    # now we need to add an index to the alt_coins table, but first we have to rename the index column
    alt_coins[, Date := index]
    alt_coins[, index := 1:.N]
    setkey(alt_coins, index)
    
    # now merge the data tables
    coin_dt_usdt <- merge(x = alt_coins, y = btc_usd, by = &quot;Date&quot;)
    
    # now calculate the price in usdt
    coin_dt_usdt[, price_usdt := ifelse(pair == &quot;USDT_BTC&quot;, USDT_BTC_price, weightedaverage * USDT_BTC_price)]
    
    # now get rid of the extra columns
    coin_dt_usdt[, c(&quot;USDT_BTC_price&quot;, &quot;USDT_BTC_pair&quot;) := NULL]
    
    # we need to change some column names
    col_names_to_change <- c(&quot;pair&quot;, &quot;high&quot;, &quot;low&quot;, &quot;open&quot;, &quot;close&quot;, &quot;volume&quot;, &quot;quotevolume&quot;, &quot;weightedaverage&quot;)
    col_names <- names(coin_dt_usdt)
    col_names[col_names %in% col_names_to_change] <- paste0(col_names_to_change, '_btc')
    
    setnames(coin_dt_usdt, col_names)
    
    # add a column for the usdt pair
    coin_dt_usdt[, pair_usdt := gsub(&quot;BTC_&quot;, &quot;USDT_&quot;, pair_btc)]
    
    # adjust col order
    setcolorder(coin_dt_usdt, c(1:10, 12, 11))
    
    # set key again
    setkey(coin_dt_usdt, index)
    
    # now get rid of the index column since it is not needed anymore
    coin_dt_usdt[, index := NULL]
    
    # now put together the return list  
    return_list <- list(alt_chart_list = chart_list, alt_dt = coin_dt, alt_usdt_dt = coin_dt_usdt)
  }else{
    return_list <- list(alt_chart_list = chart_list, alt_dt = coin_dt)
  }
  
  return(return_list)
}" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #A6E22E">get_alt_data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">tz</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;UTC&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">coin</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;ETH&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;LTC&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">add_bitcoin</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">return_in_USDT</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">from</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2017-01-01&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">to</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2018-04-09&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">period</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;D&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">verbose</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">FALSE</span><span style="color: #F8F8F2">){</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># We will be using the public API</span></span>
<span class="line"><span style="color: #F8F8F2">  poloniex.public </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> PoloniexPublicAPI()</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># set the session time zone (the environment variable name is TZ, in upper case)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">Sys.setenv</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">TZ</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> tz)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># convert from and to into time obj</span></span>
<span class="line"><span style="color: #F8F8F2">  from  </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">as.POSIXct</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">paste</span><span style="color: #F8F8F2">(from, tz, </span><span style="color: #FD971F">sep</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;&quot;</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">  to    </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">as.POSIXct</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">paste</span><span style="color: #F8F8F2">(to, tz, </span><span style="color: #FD971F">sep</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;&quot;</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># lists to store data.tables and xts objects</span></span>
<span class="line"><span style="color: #F8F8F2">  chart_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">()</span></span>
<span class="line"><span style="color: #F8F8F2">  dt_list    </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">()</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># make sure the coin pair is in upper case</span></span>
<span class="line"><span style="color: #F8F8F2">  coin       </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">toupper</span><span style="color: #F8F8F2">(coin)</span></span>
<span class="line"><span style="color: #F8F8F2">  coin_pairs </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">paste0</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;BTC_&quot;</span><span style="color: #F8F8F2">, coin[coin </span><span style="color: #F92672">!=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC&quot;</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">if</span><span style="color: #F8F8F2">(add_bitcoin </span><span style="color: #F92672">|</span><span style="color: #F8F8F2"> return_in_USDT) coin_pairs </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;USDT_BTC&quot;</span><span style="color: #F8F8F2">, coin_pairs)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># loop over the coins to get the data</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">for</span><span style="color: #F8F8F2">(i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> coin_pairs){</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">if</span><span style="color: #F8F8F2">(verbose)</span></span>
<span class="line"><span style="color: #F8F8F2">      </span><span style="color: #F92672">invisible</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">cat</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&#39;</span><span style="color: #AE81FF">\t</span><span style="color: #E6DB74">Getting data for &#39;</span><span style="color: #F8F8F2">, i, </span><span style="color: #E6DB74">&#39; pair</span><span style="color: #AE81FF">\n</span><span style="color: #E6DB74">&#39;</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># this is a list that will contain the chart data for each coin pair</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #66D9EF">try</span><span style="color: #F8F8F2">(chart_list[[i]] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ReturnChartData(</span><span style="color: #FD971F">theObject</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> poloniex.public</span></span>
<span class="line"><span style="color: #F8F8F2">                                       , pair      = i</span></span>
<span class="line"><span style="color: #F8F8F2">                                       , from      = from</span></span>
<span class="line"><span style="color: #F8F8F2">                                       , to        = to</span></span>
<span class="line"><span style="color: #F8F8F2">                                       , period    = period)</span></span>
<span class="line"><span style="color: #F8F8F2">        , </span><span style="color: #FD971F">silent</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># list to contain data.tables </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #66D9EF">try</span><span style="color: #F8F8F2">(dt_list[[i]] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(chart_list[[i]]), </span><span style="color: #FD971F">silent</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># convert to data.table and make sure to add a column containing the pairs</span></span>
<span class="line"><span style="color: #F8F8F2">  coin_dt </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> rbindlist(</span><span style="color: #FD971F">l</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> dt_list, </span><span style="color: #FD971F">use.names</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">idcol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;pair&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># return data in usdt prices</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">if</span><span style="color: #F8F8F2">(return_in_USDT){</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># to get the price of the alt coin in usdt is not that simple but we&#39;ll do it</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># get a DT of the btc_usdt pair</span></span>
<span class="line"><span style="color: #F8F8F2">    btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> coin_dt[pair </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;USDT_BTC&quot;</span><span style="color: #F8F8F2">]</span></span>
<span class="line"><span style="color: #F8F8F2">    btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_usd[, .(index, pair, weightedaverage)]</span></span>
<span class="line"><span style="color: #F8F8F2">    setnames(btc_usd, </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;Date&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;USDT_BTC_pair&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;USDT_BTC_price&quot;</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># get DT with only alt coins</span></span>
<span class="line"><span style="color: #F8F8F2">    alt_coins </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> copy(coin_dt)</span><span style="color: #88846F">#[pair != &quot;USDT_BTC&quot;]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now we need to add an index to the alt_coins table, but first we have to rename the index column</span></span>
<span class="line"><span style="color: #F8F8F2">    alt_coins[, Date </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> index]</span></span>
<span class="line"><span style="color: #F8F8F2">    alt_coins[, index </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #F8F8F2">.N]</span></span>
<span class="line"><span style="color: #F8F8F2">    setkey(alt_coins, index)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now merge the data tables</span></span>
<span class="line"><span style="color: #F8F8F2">    coin_dt_usdt </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">merge</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> alt_coins, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> btc_usd, </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Date&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now calculate the price in usdt</span></span>
<span class="line"><span style="color: #F8F8F2">    coin_dt_usdt[, price_usdt </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">ifelse</span><span style="color: #F8F8F2">(pair </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;USDT_BTC&quot;</span><span style="color: #F8F8F2">, USDT_BTC_price, weightedaverage </span><span style="color: #F92672">*</span><span style="color: #F8F8F2"> USDT_BTC_price)]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now get rid of the extra columns</span></span>
<span class="line"><span style="color: #F8F8F2">    coin_dt_usdt[, </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;USDT_BTC_price&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;USDT_BTC_pair&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">NULL</span><span style="color: #F8F8F2">]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># we need to change some column names</span></span>
<span class="line"><span style="color: #F8F8F2">    col_names_to_change </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;pair&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;high&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;low&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;open&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;close&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;volume&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;quotevolume&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;weightedaverage&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    col_names </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">names</span><span style="color: #F8F8F2">(coin_dt_usdt)</span></span>
<span class="line"><span style="color: #F8F8F2">    col_names[col_names </span><span style="color: #F92672">%in%</span><span style="color: #F8F8F2"> col_names_to_change] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">paste0</span><span style="color: #F8F8F2">(col_names_to_change, </span><span style="color: #E6DB74">&#39;_btc&#39;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    setnames(coin_dt_usdt, col_names)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># add a column for the usdt pair</span></span>
<span class="line"><span style="color: #F8F8F2">    coin_dt_usdt[, pair_usdt </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">gsub</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;BTC_&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;USDT_&quot;</span><span style="color: #F8F8F2">, pair_btc)]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># adjust col order</span></span>
<span class="line"><span style="color: #F8F8F2">    setcolorder(coin_dt_usdt, </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #AE81FF">10</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">12</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">11</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># set key again</span></span>
<span class="line"><span style="color: #F8F8F2">    setkey(coin_dt_usdt, index)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now get rid of the index column since it is not needed anymore</span></span>
<span class="line"><span style="color: #F8F8F2">    coin_dt_usdt[, index </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">NULL</span><span style="color: #F8F8F2">]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># now put together the return list  </span></span>
<span class="line"><span style="color: #F8F8F2">    return_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">alt_chart_list</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> chart_list, </span><span style="color: #FD971F">alt_dt</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> coin_dt, </span><span style="color: #FD971F">alt_usdt_dt</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> coin_dt_usdt)</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span><span style="color: #F92672">else</span><span style="color: #F8F8F2">{</span></span>
<span class="line"><span style="color: #F8F8F2">    return_list </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">alt_chart_list</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> chart_list, </span><span style="color: #FD971F">alt_dt</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> coin_dt)</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">return</span><span style="color: #F8F8F2">(return_list)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span></code></pre></div>



<p class="wp-block-paragraph">The function above can be used to download data for multiple coins at the same time. It returns a list: the alt_dt element is a data.table object with data for all coins in the function call and, when prices in USDT are requested, the alt_usdt_dt element holds the same data with the price in USDT added. Even if the user doesn’t add bitcoin to the list of coins, the function adds bitcoin by default. This can be deactivated with the add_bitcoin argument. Here is an example:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.6875px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# get alt data for some coins
alt_data <- get_alt_data(return_in_USDT = T
                         , from = &quot;2015-01-01&quot;
                         , coin = c('ETH','XRP', 'BCH', 'LTC', 'NEO', 'XMR', 'DASH', 'XEM'))[['alt_usdt_dt']]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># get alt data for some coins</span></span>
<span class="line"><span style="color: #F8F8F2">alt_data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> get_alt_data(</span><span style="color: #FD971F">return_in_USDT</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> T</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">from</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2015-01-01&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                         , </span><span style="color: #FD971F">coin</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&#39;ETH&#39;</span><span style="color: #F8F8F2">,</span><span style="color: #E6DB74">&#39;XRP&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;BCH&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;LTC&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;NEO&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;XMR&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;DASH&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&#39;XEM&#39;</span><span style="color: #F8F8F2">))[[</span><span style="color: #E6DB74">&#39;alt_usdt_dt&#39;</span><span style="color: #F8F8F2">]]</span></span></code></pre></div>



<p class="wp-block-paragraph" id="a8ab">Let’s look at the data we just downloaded:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704856872558594px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="head(alt_data)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">head</span><span style="color: #F8F8F2">(alt_data)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="249" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_yabOeX6Rqn8e6eDO1XzGFg.webp" alt="" class="wp-image-4908" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_yabOeX6Rqn8e6eDO1XzGFg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_yabOeX6Rqn8e6eDO1XzGFg-500x150.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_yabOeX6Rqn8e6eDO1XzGFg-150x45.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_yabOeX6Rqn8e6eDO1XzGFg-768x231.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="9a46">The table shows the date, OHLC, volume, and weightedaverage price in BTC. It also shows the pair and the price in USD that we added.</p>
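
<p class="wp-block-paragraph">To make the USD conversion concrete, here is a minimal sketch of the idea on made-up numbers: attach the USDT_BTC price of the day to every row, then multiply the BTC-quoted price by it. The table names toy_dt, toy_btc, and toy_usdt and all values are illustrative assumptions, not data from the exchange.</p>

```r
library(data.table)

# Toy data (made-up numbers): one pair quoted in BTC plus the USDT_BTC pair itself
toy_dt <- data.table(
  Date = as.Date(c("2018-01-01", "2018-01-01", "2018-01-02", "2018-01-02")),
  pair = c("BTC_ETH", "USDT_BTC", "BTC_ETH", "USDT_BTC"),
  weightedaverage = c(0.05, 10000, 0.06, 12000)
)

# Daily BTC price in USDT, one row per date
toy_btc <- toy_dt[pair == "USDT_BTC", .(Date, USDT_BTC_price = weightedaverage)]

# Attach the BTC price of the day to every row, then convert to USDT
toy_usdt <- merge(x = toy_dt, y = toy_btc, by = "Date")
toy_usdt[, price_usdt := ifelse(pair == "USDT_BTC",
                                USDT_BTC_price,
                                weightedaverage * USDT_BTC_price)]

toy_usdt[, .(Date, pair, price_usdt)]
```

<p class="wp-block-paragraph">The USDT_BTC rows are already quoted in USDT, which is why they are passed through unchanged rather than multiplied.</p>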



<h2 class="wp-block-heading" id="c5e5">Bitcoin-Altcoins Correlations</h2>



<p class="wp-block-paragraph" id="dbac">Whenever I look at the prices of the coins available on my <a href="https://www.coinbase.com/" target="_blank" rel="noreferrer noopener">Coinbase</a> app, I am always struck by the similarity of the price trends between the four coins available on Coinbase: BTC, ETH, BCH, and LTC (see the figure below). So I thought it would be a good idea to explore the correlation in price trends between altcoins and bitcoin.</p>


<div class="wp-block-image">
<figure class="aligncenter size-large"><img loading="lazy" decoding="async" width="661" height="1024" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA-661x1024.webp" alt="" class="wp-image-4909" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA-661x1024.webp 661w, https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA-323x500.webp 323w, https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA-97x150.webp 97w, https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA-768x1190.webp 768w, https://analyticadss.com/wp-content/uploads/2022/12/1_6a1iEbXQTe5L9tKq9xs1XA.webp 786w" sizes="auto, (max-width: 661px) 100vw, 661px" /></figure>
</div>


<p class="wp-block-paragraph" id="47a1">Let’s look at the price trends of the coins we just downloaded. To better see potential correlations, I am going to zoom in on 2018 only.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="p <- ggplot(alt_data[year(Date) == 2018], aes(x = Date, y =  price_usdt, col = pair_usdt)) + geom_line()
p <- p + facet_wrap(~pair_usdt, scales = &quot;free&quot;, ncol = 3) + theme_minimal() + theme(legend.position=&quot;none&quot;) + ylab(&quot;Price (USD)&quot;)
p" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[year(Date) </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2018</span><span style="color: #F8F8F2">], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  price_usdt, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line()</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> facet_wrap(</span><span style="color: #F92672">~</span><span style="color: #F8F8F2">pair_usdt, </span><span style="color: #FD971F">scales</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;free&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">ncol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_minimal() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme(</span><span style="color: #FD971F">legend.position</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&quot;none&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Price (USD)&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">p</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="499" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_lXmT2c5hQ6tik7sNfwvFPw.webp" alt="" class="wp-image-4910" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_lXmT2c5hQ6tik7sNfwvFPw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_lXmT2c5hQ6tik7sNfwvFPw-500x301.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_lXmT2c5hQ6tik7sNfwvFPw-150x90.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_lXmT2c5hQ6tik7sNfwvFPw-768x463.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Prices of Bitcoin and other altcoins in 2018<br></figcaption></figure>
</div>


<p class="wp-block-paragraph" id="3bf6">The figure above shows that some coins seem to be more correlated with Bitcoin than others. It also shows that the degree of similarity between Bitcoin and another coin varies over time. More on this below.</p>



<p class="wp-block-paragraph" id="73c0">Trying to find correlations between time series using the Pearson correlation coefficient, or other metrics meant for stationary data, can give misleading results, because price time series are generally not stationary. Similar trends in time series data can also be very misleading; a nice article on this topic can be found <a href="https://svds.com/avoiding-common-mistakes-with-time-series/" rel="noreferrer noopener" target="_blank">here</a>. And always remember that <strong>Correlation doesn’t guarantee Causation</strong>.</p>
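
<p class="wp-block-paragraph">A quick simulation shows how a shared trend alone can produce a high Pearson coefficient. The sketch below uses made-up series, an arbitrary seed, and base R only; x, y, and the other names are hypothetical and have nothing to do with the coin data.</p>

```r
set.seed(42)  # arbitrary seed, only for reproducibility

n <- 500
trend <- seq_len(n)

# Two series that share nothing but an upward trend; the noise terms are independent
x <- trend + rnorm(n, sd = 5)
y <- trend + rnorm(n, sd = 5)

# Pearson correlation of the raw levels is dominated by the common trend
cor_levels <- cor(x, y)

# Differencing removes the trend; what is left is driven by the independent noise
cor_changes <- cor(diff(x), diff(y))

round(c(levels = cor_levels, changes = cor_changes), 3)
```

<p class="wp-block-paragraph">The correlation of the levels comes out close to 1 even though the two noise terms are unrelated, while the correlation of the step-to-step changes should be close to 0. This is why working with changes rather than raw prices is the safer choice.</p>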



<p class="wp-block-paragraph" id="0f96">The bottom line is that one has to be careful when cross-correlating time series. In order to perform a proper correlation analysis, we need to add some new variables to our table.</p>



<h2 class="wp-block-heading" id="510f">Percentage Daily Change</h2>



<p class="wp-block-paragraph" id="0ed2">Percentage daily change measures the price change of a coin over a period of one day. Let’s add that to the table. Notice that we are calculating this variable using the USD price, and not the price in Bitcoin.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# add daily price change
alt_data[, pct_change := Delt(price_usdt), by = pair_usdt]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># add daily price change</span></span>
<span class="line"><span style="color: #F8F8F2">alt_data[, pct_change </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> Delt(price_usdt), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt]</span></span></code></pre></div>
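
<p class="wp-block-paragraph">For readers who want to see what this variable contains, here is a rough equivalent written with data.table’s shift() on made-up prices. It assumes that Delt() (from the quantmod package) is used with its default settings, which, as far as I know, give the one-period arithmetic change; the toy_changes table is hypothetical.</p>

```r
library(data.table)

# Toy prices (made-up numbers) for two pairs, already sorted by date within each pair
toy_changes <- data.table(
  pair_usdt  = rep(c("USDT_BTC", "USDT_LTC"), each = 3),
  price_usdt = c(100, 110, 99, 50, 55, 66)
)

# One-day fractional change within each pair: (p_t - p_{t-1}) / p_{t-1}
toy_changes[, pct_change := price_usdt / shift(price_usdt) - 1, by = pair_usdt]

toy_changes
```

<p class="wp-block-paragraph">The result is a fraction, so 0.1 means a 10% rise. The first row of each pair is NA because there is no previous day to compare with, and the rows need to be sorted by date within each pair for the lag to make sense.</p>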



<h2 class="wp-block-heading" id="a775">Normalized Price in USD</h2>



<p class="wp-block-paragraph" id="a42d">Since the prices vary a lot, both over time for the same coin and between coins, we will add a variable for the normalized price in USD: each price divided by the maximum price of that coin. This variable will make it easy to plot the prices of several coins on the same figure.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70489501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# add normalized prices in usdt
alt_data[, price_usdt_norm := price_usdt/max(price_usdt), by = pair_usdt]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># add normalized prices in usdt</span></span>
<span class="line"><span style="color: #F8F8F2">alt_data[, price_usdt_norm </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> price_usdt</span><span style="color: #F92672">/</span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(price_usdt), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt]</span></span></code></pre></div>
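
<p class="wp-block-paragraph">To see what the normalization does, here is the same operation on a small made-up table. The toy_norm table and its values are hypothetical; only the by-group division mirrors the line above.</p>

```r
library(data.table)

# Toy prices (made-up numbers) on very different scales
toy_norm <- data.table(
  pair_usdt  = rep(c("USDT_BTC", "USDT_LTC"), each = 3),
  price_usdt = c(8000, 10000, 5000, 100, 250, 200)
)

# Divide each price by the highest price of the same pair
toy_norm[, price_usdt_norm := price_usdt / max(price_usdt), by = pair_usdt]

toy_norm
```

<p class="wp-block-paragraph">Every pair now peaks at exactly 1, so coins that trade at very different prices can share one axis. Note that the maximum is taken over the whole downloaded history, so within a single year the normalized price may never reach 1.</p>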



<p class="wp-block-paragraph" id="9cec">Now that we have the normalized prices in USD, let’s look at the prices of bitcoin and litecoin on the same figure. We’ll do that for 2018 so we can better see any possible correlations.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="p <- ggplot(alt_data[year(Date) == 2018 & pair_usdt %like% &quot;BTC|LTC&quot;], aes(x = Date, y =  price_usdt_norm, col = pair_usdt)) + geom_line()
p <- p + theme_minimal() + ylab(&quot;Normalized price&quot;)
ggplotly(p)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[year(Date) </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2018</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">&</span><span style="color: #F8F8F2"> pair_usdt </span><span style="color: #F92672">%like%</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC|LTC&quot;</span><span style="color: #F8F8F2">], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  price_usdt_norm, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line()</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_minimal() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Normalized price&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">ggplotly(p)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="483" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_HHs59LsNRYT3sbL8I0Sdmg.webp" alt="" class="wp-image-4913" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_HHs59LsNRYT3sbL8I0Sdmg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_HHs59LsNRYT3sbL8I0Sdmg-500x292.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_HHs59LsNRYT3sbL8I0Sdmg-150x88.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_HHs59LsNRYT3sbL8I0Sdmg-768x448.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph">The trends in the prices of BTC and LTC are very similar. Let’s look at the price trends for 2017:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="p <- ggplot(alt_data[year(Date) == 2017 & pair_usdt %like% &quot;BTC|LTC&quot;], aes(x = Date, y =  price_usdt_norm, col = pair_usdt)) + geom_line()
p <- p + theme_minimal() + ylab(&quot;Price (USD)&quot;)
ggplotly(p)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[year(Date) </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2017</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">&</span><span style="color: #F8F8F2"> pair_usdt </span><span style="color: #F92672">%like%</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC|LTC&quot;</span><span style="color: #F8F8F2">], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  price_usdt_norm, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line()</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_minimal() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Price (USD)&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">ggplotly(p)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="480" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_6t9j_F4QaweBgp_Wl7Ntwg.webp" alt="" class="wp-image-4911" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_6t9j_F4QaweBgp_Wl7Ntwg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_6t9j_F4QaweBgp_Wl7Ntwg-500x290.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_6t9j_F4QaweBgp_Wl7Ntwg-150x87.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_6t9j_F4QaweBgp_Wl7Ntwg-768x445.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Prices of Bitcoin and LTC in 2017</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="c49a">It seems we need to zoom in on the last quarter of 2017, so let’s do that:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704862594604492px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="p <- ggplot(alt_data[Date >= &quot;2017-10-01&quot;  & Date < &quot;2018-01-01&quot; &  pair_usdt %like% &quot;BTC|LTC&quot;], aes(x = Date, y =  price_usdt_norm, col = pair_usdt)) + geom_line()
p <- p + theme_minimal() + ylab(&quot;Price (USD)&quot;)
ggplotly(p)
" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[Date </span><span style="color: #F92672">>=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2017-10-01&quot;</span><span style="color: #F8F8F2">  </span><span style="color: #F92672">&</span><span style="color: #F8F8F2"> Date </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2018-01-01&quot;</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">&</span><span style="color: #F8F8F2">  pair_usdt </span><span style="color: #F92672">%like%</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC|LTC&quot;</span><span style="color: #F8F8F2">], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  price_usdt_norm, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: 
#F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line()</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_minimal() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Price (USD)&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">ggplotly(p)</span></span>
<span class="line"></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="490" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_yEZOrUPaK-ZhizMdmeFCmA.webp" alt="" class="wp-image-4914" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_yEZOrUPaK-ZhizMdmeFCmA.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_yEZOrUPaK-ZhizMdmeFCmA-500x296.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_yEZOrUPaK-ZhizMdmeFCmA-150x89.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_yEZOrUPaK-ZhizMdmeFCmA-768x454.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Prices of Bitcoin and LTC in the last quarter of 2017<br></figcaption></figure>
</div>


<p class="wp-block-paragraph" id="fcd8">It is clear from the above figure that the correlation between the prices of bitcoin and LTC varies over time. Note how bitcoin’s highest price, on December 17, 2017, preceded LTC’s highest price, on December 19, 2017, by two days. This wasn’t the case for the ATH, which occurred on January 6, 2018 for both coins.</p>



<h2 class="wp-block-heading" id="6009">Static Correlations </h2>



<h3 class="wp-block-heading" id="6009">(and why you shouldn’t use them with crypto!)</h3>



<p class="wp-block-paragraph" id="9035">Up until now I haven’t calculated any correlations between the prices of different coins. You might ask why we should even care about correlations in time series. Well, in the case of financial time series data, if one can show that a correlation exists between two time series, then one can use this correlation to model or predict the price movement of one coin or stock given the price trends of another.</p>



<p class="wp-block-paragraph" id="9035"><strong>However</strong>, as we mentioned earlier, the correlation between time series is not static; it changes over time. Let’s actually show that. To do so, I am going to calculate the <a href="https://en.wikipedia.org/wiki/Pearson_correlation_coefficient" target="_blank" rel="noreferrer noopener">Pearson correlation coefficient</a>. In simple terms, the Pearson correlation coefficient of two vectors of data measures how strongly the two are linearly related. Its value ranges from -1, perfectly anti-correlated, to 1, perfectly correlated, so the correlation coefficient of a series of numbers with itself is 1. A value of zero means there is no linear correlation. Remember, a single coefficient like this only works for static data.</p>
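<p class="wp-block-paragraph">As a quick sanity check of those boundary values, here is a minimal sketch on made-up toy vectors (not the crypto data). The helper <code>pearson_r</code> is a hypothetical name introduced only to spell out the definition; in practice base R’s <code>cor()</code> does the same job.</p>

<pre class="wp-block-code"><code># toy vector, made up for illustration (not the crypto data)
x <- c(1, 2, 3, 4, 5)

cor(x, x)          # a series with itself: 1
cor(x, 2 * x + 1)  # increasing linear function of x: 1
cor(x, -3 * x)     # decreasing linear function of x: -1

# Pearson's r written out from its definition (hypothetical helper)
pearson_r <- function(a, b) {
  sum((a - mean(a)) * (b - mean(b))) /
    sqrt(sum((a - mean(a))^2) * sum((b - mean(b))^2))
}

pearson_r(x, c(2, 1, 4, 3, 5))  # 0.8, the same value cor() returns
</code></pre>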



<p class="wp-block-paragraph" id="62e8">In order to perform correlation on our data I am going to need to do some data transformation:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70489501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# subset data, only keep the date, the pair, and the price
alt_data_sub <- alt_data[, .(Date, pair_usdt, price_usdt)]
# convert to wide format 
alt_data_sub <- spread(data = alt_data_sub, key = &quot;pair_usdt&quot;, value = &quot;price_usdt&quot;)
# clean column names
setnames(alt_data_sub, gsub(&quot;USDT_&quot;, &quot;&quot;, colnames(alt_data_sub)))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># subset data, only keep the date, the pair, and the price</span></span>
<span class="line"><span style="color: #F8F8F2">alt_data_sub </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> alt_data[, .(Date, pair_usdt, price_usdt)]</span></span>
<span class="line"><span style="color: #88846F"># convert to wide format </span></span>
<span class="line"><span style="color: #F8F8F2">alt_data_sub </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> spread(</span><span style="color: #FD971F">data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> alt_data_sub, </span><span style="color: #FD971F">key</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;pair_usdt&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">value</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;price_usdt&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #88846F"># clean column names</span></span>
<span class="line"><span style="color: #F8F8F2">setnames(alt_data_sub, </span><span style="color: #66D9EF">gsub</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;USDT_&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">colnames</span><span style="color: #F8F8F2">(alt_data_sub)))</span></span></code></pre></div>



<p class="wp-block-paragraph">The new table we created contains the date along with the prices in USDT for each coin we have in our table.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.7048492431640625px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="tail(alt_data_sub)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">tail</span><span style="color: #F8F8F2">(alt_data_sub)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="236" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_3onSqsAzx-A1jNaprt9BMw.webp" alt="" class="wp-image-4915" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_3onSqsAzx-A1jNaprt9BMw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_3onSqsAzx-A1jNaprt9BMw-500x143.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_3onSqsAzx-A1jNaprt9BMw-150x43.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_3onSqsAzx-A1jNaprt9BMw-768x219.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph">Again, what I am doing here is not correct; I am just trying to show you why we shouldn’t be doing static correlations on crypto data. Now we’ll calculate the Pearson correlation coefficient between each pair of coins we have, and then make a nice plot of these coefficients.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704833984375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# calculate the correlation matrix
M <- cor(alt_data_sub[, -1], use = &quot;complete.obs&quot;) # notice how we are ignoring missing data with the last argument
# plot the correlation matrix
corrplot.mixed(corr = M, upper = &quot;ellipse&quot;, lower = &quot;number&quot;, order = &quot;AOE&quot;, tl.col = &quot;black&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># calculate the correlation matrix</span></span>
<span class="line"><span style="color: #F8F8F2">M </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">cor</span><span style="color: #F8F8F2">(alt_data_sub[, </span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">], </span><span style="color: #FD971F">use</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;complete.obs&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #88846F"># notice how we are ignoring missing data with the last argument</span></span>
<span class="line"><span style="color: #88846F"># plot the correlation matrix</span></span>
<span class="line"><span style="color: #F8F8F2">corrplot.mixed(</span><span style="color: #FD971F">corr</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> M, </span><span style="color: #FD971F">upper</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;ellipse&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">lower</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;number&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">order</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;AOE&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">tl.col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;black&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="826" height="722" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_mjAaFbtDf5GeYp6RCppeSA.webp" alt="" class="wp-image-4916" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_mjAaFbtDf5GeYp6RCppeSA.webp 826w, https://analyticadss.com/wp-content/uploads/2022/12/1_mjAaFbtDf5GeYp6RCppeSA-500x437.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_mjAaFbtDf5GeYp6RCppeSA-150x131.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_mjAaFbtDf5GeYp6RCppeSA-768x671.webp 768w" sizes="auto, (max-width: 826px) 100vw, 826px" /></figure>
</div>


<p class="wp-block-paragraph" id="99ac">The figure above shows the correlation coefficients between the different coins. It is easy to read visually: the darker the color of the ellipse and the more diagonal the ellipse, the higher the correlation coefficient. Of course, you can also just look at the numbers in the bottom-left part of the figure to get the value of the coefficient between two coins :). The figure shows how highly correlated the prices of cryptocurrencies can be. For example, XRP and XEM have a correlation coefficient of 0.93. The highest correlation seems to be between BCH and DASH, with a coefficient of 0.97.</p>
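<p class="wp-block-paragraph">If you would rather not hunt for the largest coefficient by eye, it can be pulled out of the matrix programmatically. The sketch below uses a made-up 3 x 3 matrix, <code>M_toy</code>, with hypothetical coin names A, B and C; the same steps should carry over to a real correlation matrix such as <code>M</code>.</p>

<pre class="wp-block-code"><code># toy correlation matrix; the values are made up for illustration
M_toy <- matrix(c(1.0, 0.2, 0.9,
                  0.2, 1.0, 0.5,
                  0.9, 0.5, 1.0),
                nrow = 3, byrow = TRUE,
                dimnames = list(c("A", "B", "C"), c("A", "B", "C")))

# blank out the diagonal and the duplicated upper triangle
M_low <- M_toy
M_low[upper.tri(M_low, diag = TRUE)] <- NA

# row/column position of the largest remaining coefficient
idx <- which(M_low == max(M_low, na.rm = TRUE), arr.ind = TRUE)

rownames(M_toy)[idx[1, "row"]]  # "C"
colnames(M_toy)[idx[1, "col"]]  # "A"
M_toy[idx]                      # 0.9
</code></pre>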



<p class="wp-block-paragraph" id="bcf4">All of the correlation coefficients we see in the above figure look substantial. The question is: do these correlations vary over time? To answer this question, I will calculate the correlation coefficient between Bitcoin and DASH on a monthly basis (you can do this for any time period) and show that this coefficient varies greatly over time. Let’s do that:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395828247070312px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# subset the data
btc_dash <- alt_data_sub[, .(Date, BTC, DASH)]
# add a year_month column
btc_dash[, year_month := as.yearmon(Date)]
# calculate the correlation coefficient on a monthly basis
btc_dash_2 <- btc_dash[, cor(BTC, DASH), by = year_month]
# now plot the correlation coefficient as a function of month and year
plot(btc_dash_2$year_month, btc_dash_2$V1, xlab = &quot;Year-Month&quot;, main = &quot;Correlation Coeff. Between BTC and DASH Over time&quot;
     , ylab = &quot;Correlation Coefficient&quot;, type = &quot;b&quot;, pch = 19, col = ifelse(btc_dash_2$V1 > 0, &quot;blue&quot;, &quot;red&quot;)
     , ylim = c(-1, 1))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># subset the data</span></span>
<span class="line"><span style="color: #F8F8F2">btc_dash </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> alt_data_sub[, .(Date, BTC, DASH)]</span></span>
<span class="line"><span style="color: #88846F"># add a year_month column</span></span>
<span class="line"><span style="color: #F8F8F2">btc_dash[, year_month </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> as.yearmon(Date)]</span></span>
<span class="line"><span style="color: #88846F"># calculate the correlation coefficient on a monthly basis</span></span>
<span class="line"><span style="color: #F8F8F2">btc_dash_2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_dash[, </span><span style="color: #66D9EF">cor</span><span style="color: #F8F8F2">(BTC, DASH), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> year_month]</span></span>
<span class="line"><span style="color: #88846F"># now plot the correlation coefficient as a function of month and year</span></span>
<span class="line"><span style="color: #66D9EF">plot</span><span style="color: #F8F8F2">(btc_dash_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">year_month, btc_dash_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1, </span><span style="color: #FD971F">xlab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Year-Month&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">main</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coeff. Between BTC and DASH Over time&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coefficient&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">type</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;b&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">pch</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">19</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">ifelse</span><span style="color: #F8F8F2">(btc_dash_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1 </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;blue&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;red&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylim</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span></code></pre></div>



<p class="wp-block-paragraph" id="094a">This is interesting: the monthly correlation coefficient between bitcoin and DASH ranges from <strong>-0.91</strong>, highly anti-correlated, to <strong>0.98</strong>, highly correlated. And this is why <strong><em>you should never use static correlation metrics with crypto data!</em></strong></p>
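<p class="wp-block-paragraph">To see how a single static number can hide this kind of behavior, here is a minimal sketch on synthetic data (not real prices). The series <code>a</code> and <code>b</code> and the two 30-day “months” are assumptions made up for the example: <code>b</code> moves with <code>a</code> in the first window and against it in the second.</p>

<pre class="wp-block-code"><code># synthetic series (hypothetical): two 30-day windows
a <- c(1:30, 1:30)
b <- c(1:30, 30:1)  # follows a in window 1, mirrors it in window 2
window <- rep(c("month 1", "month 2"), each = 30)

# static correlation over the whole period
cor(a, b)  # 0 (up to floating-point rounding)

# correlation computed separately inside each window
sapply(split(seq_along(a), window), function(i) cor(a[i], b[i]))
# month 1: 1, month 2: -1
</code></pre>

<p class="wp-block-paragraph">The static coefficient suggests the two series are unrelated, even though they are perfectly correlated in one window and perfectly anti-correlated in the other.</p>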



<p class="wp-block-paragraph" id="78e2">A good blog post on this same topic was written by <a href="https://twitter.com/tomeff" rel="noreferrer noopener" target="_blank">Tom Fawcett</a> from <a href="https://www.svds.com/" rel="noreferrer noopener" target="_blank">Silicon Valley Data Science</a> and can be found <a href="https://www.svds.com/avoiding-common-mistakes-with-time-series/" rel="noreferrer noopener" target="_blank">here</a>. In his post, Tom shows, with simple simulations, why static correlations should never be used with time series.</p>
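<p class="wp-block-paragraph">Here is a small simulation in a similar spirit. It is a sketch under my own assumptions (Gaussian random walks, 500 steps, 1,000 repetitions, an arbitrary seed) and is not taken from Tom’s post. It compares how often two <em>independent</em> random walks end up with a strong static correlation versus how often their underlying independent steps do.</p>

<pre class="wp-block-code"><code>set.seed(1)      # arbitrary seed, only for reproducibility
n_steps <- 500   # length of each simulated series (assumption)
n_sims  <- 1000  # number of repetitions (assumption)

# static correlation between two independent random walks
walk_cor <- replicate(n_sims, cor(cumsum(rnorm(n_steps)), cumsum(rnorm(n_steps))))

# static correlation between the independent steps themselves
step_cor <- replicate(n_sims, cor(rnorm(n_steps), rnorm(n_steps)))

# share of simulations with a "strong" correlation, |r| > 0.5
mean(abs(walk_cor) > 0.5)
mean(abs(step_cor) > 0.5)
</code></pre>

<p class="wp-block-paragraph">Running this, you should find that strong correlations show up far more often between the random walks than between the raw steps, even though no pair of series is related by construction.</p>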



<h3 class="wp-block-heading" id="2441">Correlation Networks</h3>



<p class="wp-block-paragraph" id="a16e">There is one more plot I would like to make, which is a network plot of the correlations between the different coins. The correlation network plot helps show the strength of the correlations between the different coins. Again, these correlations are time dependent, and the figure we will be making will change over time, but I still think it is a good figure to make. Here it is:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# we will be using the great corrr package for this work
# get the correlation matrix, just like we did before
# build the correlation matrix 
# the code snippets below are taken from the following post (a great blog, BTW)
# http://www.business-science.io/timeseries-analysis/2017/07/30/tidy-timeseries-analysis-pt-3.html
corr_2 <- correlate(alt_data_sub[, -1])
# make the network plot
# Network plot
corr_net <- corr_2 %>%
  network_plot(colours = c(palette_light()[[2]], &quot;white&quot;, palette_light()[[4]]), legend = TRUE) +
  labs(
    title = &quot;Static Correlations of some Crypto Currencies&quot;,
    subtitle = &quot;2014 through 2018&quot;
  ) +
  theme_tq() +
  theme(legend.position = &quot;bottom&quot;)
corr_net" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># we will be using the great corrr package for this work</span></span>
<span class="line"><span style="color: #88846F"># get the correlation matrix, just like we did before</span></span>
<span class="line"><span style="color: #88846F"># build the correlation matrix </span></span>
<span class="line"><span style="color: #88846F"># the code below is adapted from the following post (a great blog, BTW)</span></span>
<span class="line"><span style="color: #88846F"># http://www.business-science.io/timeseries-analysis/2017/07/30/tidy-timeseries-analysis-pt-3.html</span></span>
<span class="line"><span style="color: #F8F8F2">corr_2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> correlate(alt_data_sub[, </span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #88846F"># make the network plot</span></span>
<span class="line"><span style="color: #88846F"># Network plot</span></span>
<span class="line"><span style="color: #F8F8F2">corr_net </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> corr_2 </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  network_plot(</span><span style="color: #FD971F">colours</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(palette_light()[[</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">]], </span><span style="color: #E6DB74">&quot;white&quot;</span><span style="color: #F8F8F2">, palette_light()[[</span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">]]), </span><span style="color: #FD971F">legend</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  labs(</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Static Correlations of some Crypto Currencies&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">subtitle</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2014 through 2018&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">  ) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme_tq() </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme(</span><span style="color: #FD971F">legend.position</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;bottom&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">corr_net</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="519" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_brRhP1L95H00I6m3TcD-2A.webp" alt="" class="wp-image-4917" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_brRhP1L95H00I6m3TcD-2A.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_brRhP1L95H00I6m3TcD-2A-500x313.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_brRhP1L95H00I6m3TcD-2A-150x94.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_brRhP1L95H00I6m3TcD-2A-768x481.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="9225">The figure above shows a network that measures how strongly the prices of the coins under study are correlated. The darker the edge (the line) connecting two coins, and the closer the two coins sit in the network, the stronger the correlation between them.</p>
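
<p class="wp-block-paragraph">To read the same information as numbers rather than edges, one option is to rank the coin pairs by the absolute value of their correlation. The following is only a minimal sketch in base R: the <code>toy_prices</code> table and its values are made up for illustration and are not the data used in this post.</p>

<pre class="wp-block-code"><code># Toy price table (hypothetical values, NOT the post's data)
x <- seq(0, 10, length.out = 200)
toy_prices <- data.frame(
  BTC = x + sin(x),            # trending series
  LTC = x + 0.2 * cos(3 * x),  # follows the same trend as BTC
  BCH = sin(5 * x)             # oscillates with no trend
)

# Full correlation matrix, then keep each pair once (upper triangle)
cor_mat <- cor(toy_prices)
idx <- which(upper.tri(cor_mat), arr.ind = TRUE)
pairs <- data.frame(
  coin_a = rownames(cor_mat)[idx[, "row"]],
  coin_b = colnames(cor_mat)[idx[, "col"]],
  r = cor_mat[idx]
)

# Strongest relationships (positive or negative) first
pairs <- pairs[order(-abs(pairs$r)), ]
print(pairs)
</code></pre>

<p class="wp-block-paragraph">In this toy table the BTC and LTC pair comes out on top because both series share the same trend, which is the numeric counterpart of two nodes sitting close together with a dark edge between them.</p>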



<p class="wp-block-paragraph" id="851b">From the figure, XMR, LTC, and BTC appear to sit at the heart of this network, while BCH seems to be the least correlated with the rest of the coins. Let’s see how the network plot changes between 2017 and 2018:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# subset the data and get correlation matrices
corr_2017 <- correlate(alt_data_sub[year(Date) == 2017][, -1])
corr_2018 <- correlate(alt_data_sub[year(Date) == 2018][, -1])
# build Network plots
corr_net_2017 <- corr_2017 %>%
  network_plot(colours = c(palette_light()[[2]], &quot;white&quot;, palette_light()[[4]]), legend = TRUE) +
  labs(
    title = &quot;Static Correlations of some Crypto Currencies&quot;,
    subtitle = &quot;2017&quot;
  ) +
  theme_tq() +
  theme(legend.position = &quot;bottom&quot;)
corr_net_2018 <- corr_2018 %>%
  network_plot(colours = c(palette_light()[[2]], &quot;white&quot;, palette_light()[[4]]), legend = TRUE) +
  labs(
    title = &quot;Static Correlations of some Crypto Currencies&quot;,
    subtitle = &quot;2018&quot;
  ) +
  theme_tq() +
  theme(legend.position = &quot;bottom&quot;)
# combine network plots
cow_net_plots <-plot_grid(corr_net_2017, corr_net_2018, ncol = 2)
title <- ggdraw() + 
    draw_label(label = 'Crypto Correlation Networks',
               fontface = 'bold', size = 18)
cow_out <- plot_grid(title, cow_net_plots, ncol=1, rel_heights=c(0.1, 1))
cow_out" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># subset the data and get correlation matrices</span></span>
<span class="line"><span style="color: #F8F8F2">corr_2017 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> correlate(alt_data_sub[year(Date) </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2017</span><span style="color: #F8F8F2">][, </span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #F8F8F2">corr_2018 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> correlate(alt_data_sub[year(Date) </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2018</span><span style="color: #F8F8F2">][, </span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #88846F"># build Network plots</span></span>
<span class="line"><span style="color: #F8F8F2">corr_net_2017 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> corr_2017 </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  network_plot(</span><span style="color: #FD971F">colours</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(palette_light()[[</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">]], </span><span style="color: #E6DB74">&quot;white&quot;</span><span style="color: #F8F8F2">, palette_light()[[</span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">]]), </span><span style="color: #FD971F">legend</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  labs(</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Static Correlations of some Crypto Currencies&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">subtitle</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2017&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">  ) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme_tq() </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme(</span><span style="color: #FD971F">legend.position</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;bottom&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">corr_net_2018 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> corr_2018 </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  network_plot(</span><span style="color: #FD971F">colours</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(palette_light()[[</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">]], </span><span style="color: #E6DB74">&quot;white&quot;</span><span style="color: #F8F8F2">, palette_light()[[</span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">]]), </span><span style="color: #FD971F">legend</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  labs(</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Static Correlations of some Crypto Currencies&quot;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">subtitle</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;2018&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">  ) </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme_tq() </span><span style="color: #F92672">+</span></span>
<span class="line"><span style="color: #F8F8F2">  theme(</span><span style="color: #FD971F">legend.position</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;bottom&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #88846F"># combine network plots</span></span>
<span class="line"><span style="color: #F8F8F2">cow_net_plots </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2">plot_grid(corr_net_2017, corr_net_2018, </span><span style="color: #FD971F">ncol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">title </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggdraw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span></span>
<span class="line"><span style="color: #F8F8F2">    draw_label(</span><span style="color: #FD971F">label</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&#39;Crypto Correlation Networks&#39;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">               </span><span style="color: #FD971F">fontface</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&#39;bold&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">size</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">18</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">cow_out </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> plot_grid(title, cow_net_plots, </span><span style="color: #FD971F">ncol</span><span style="color: #F92672">=</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">rel_heights</span><span style="color: #F92672">=</span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">0.1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">cow_out</span></span></code></pre></div>



<p class="wp-block-paragraph" id="058c">As can be seen, the correlation networks do change over time. This is not news, since we already saw in the previous section that the value of the correlation varies over time. (We only showed this for the BTC-DASH pair, but the next section shows that it holds for the rest of the coins as well.)</p>
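
<p class="wp-block-paragraph">A quick way to check this kind of change without drawing a network is to compute one correlation matrix per calendar year and compare a single coefficient across years. The sketch below uses base R and a made-up <code>toy</code> table (the values are hypothetical and chosen so that the sign of the correlation flips between the two years); it is not the post's data.</p>

<pre class="wp-block-code"><code># Toy daily prices for two coins over 2017-2018 (made-up numbers)
dates <- seq(as.Date("2017-01-01"), as.Date("2018-12-31"), by = "day")
t_idx <- seq_along(dates)
toy <- data.frame(
  Date = dates,
  BTC = t_idx + 10 * sin(t_idx / 20),
  # DASH follows the trend in 2017 and moves against it in 2018
  DASH = ifelse(format(dates, "%Y") == "2017", t_idx, 1000 - t_idx)
)

# One correlation matrix per calendar year
by_year <- split(toy[, c("BTC", "DASH")], format(toy$Date, "%Y"))
yearly_cor <- lapply(by_year, cor)

# Pull out the BTC-DASH coefficient for each year
btc_dash_by_year <- sapply(yearly_cor, function(m) m["BTC", "DASH"])
print(btc_dash_by_year)
</code></pre>

<p class="wp-block-paragraph">With these toy values the coefficient is strongly positive for 2017 and strongly negative for 2018, which is the sort of shift the two network plots make visible at a glance.</p>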



<h3 class="wp-block-heading" id="eace">Daily Returns Correlations</h3>



<p class="wp-block-paragraph" id="ffb8">Let’s look at the percentage daily changes of the altcoins between 2015 and today.</p>
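
<p class="wp-block-paragraph">As a reminder of what is being plotted, here is a minimal sketch of a daily percentage change, assuming that <code>pct_change</code> is a simple return, that is, today's price minus yesterday's price, divided by yesterday's price. That definition is an assumption on my part for this sketch, and the <code>close_price</code> vector is made up.</p>

<pre class="wp-block-code"><code># Hypothetical closing prices for one coin on five consecutive days
close_price <- c(100, 110, 99, 99, 120)

# Simple daily return: (today - yesterday) / yesterday
# The first day has no previous price, so its return is NA
pct_change <- c(NA, diff(close_price) / head(close_price, -1))

# Express as a percentage, as in the plots
round(100 * pct_change, 2)
# -> NA, 10, -10, 0, 21.21
</code></pre>

<p class="wp-block-paragraph">In a long table that holds several coins, the same calculation has to be done separately within each coin, so that the first row of one coin is not compared with the last row of another.</p>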



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# plot the percent changes
p <- ggplot(alt_data[Date > ymd(&quot;2015-01-01&quot;)], aes(x = Date, y =  (100*pct_change), col = pair_usdt)) + geom_line()
p <- p + ggtitle(&quot;% Daily Returns over time&quot;) + ylab(&quot;Daily Return (%)&quot;) 
p <- p + theme_bw() + guides(col=guide_legend(title=&quot;Coin Pair&quot;))
ggplotly(p)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># plot the percent changes</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[Date </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2015-01-01&quot;</span><span style="color: #F8F8F2">)], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  (</span><span style="color: #AE81FF">100</span><span style="color: #F92672">*</span><span style="color: #F8F8F2">pct_change), </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line()</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ggtitle(</span><span style="color: #E6DB74">&quot;% Daily Returns over time&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Daily Return (%)&quot;</span><span style="color: #F8F8F2">) </span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> guides(</span><span style="color: #FD971F">col</span><span style="color: #F92672">=</span><span style="color: #F8F8F2">guide_legend(</span><span style="color: #FD971F">title</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&quot;Coin Pair&quot;</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">ggplotly(p)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="492" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_DF6wejykPy9QxICaxujyWg.webp" alt="" class="wp-image-4918" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_DF6wejykPy9QxICaxujyWg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_DF6wejykPy9QxICaxujyWg-500x297.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_DF6wejykPy9QxICaxujyWg-150x89.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_DF6wejykPy9QxICaxujyWg-768x456.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Percentage daily returns for some coins</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="def1">Although the figure above is very cluttered, one thing is clear: percentage daily returns vary greatly for crypto. Let’s try to make this figure a bit easier to read:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="p <- ggplot(alt_data[Date > ymd(&quot;2015-01-01&quot;)], aes(x = Date, y =  (100*pct_change), col = pair_usdt)) + geom_line() + facet_wrap(~ pair_usdt)
p <- p + ggtitle(&quot;Percentage Daily Returns over time&quot;) + ylab(&quot;Daily Return (%)&quot;) 
p <- p + theme_bw() + theme(legend.position=&quot;none&quot;) 
ggplotly(p)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[Date </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2015-01-01&quot;</span><span style="color: #F8F8F2">)], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  (</span><span style="color: #AE81FF">100</span><span style="color: #F92672">*</span><span style="color: #F8F8F2">pct_change), </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> facet_wrap(</span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> pair_usdt)</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ggtitle(</span><span style="color: #E6DB74">&quot;Percentage Daily Returns over time&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Daily Return (%)&quot;</span><span style="color: #F8F8F2">) </span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> p </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme(</span><span style="color: #FD971F">legend.position</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&quot;none&quot;</span><span style="color: #F8F8F2">) </span></span>
<span class="line"><span style="color: #F8F8F2">ggplotly(p)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="563" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_frUEGj6qn8sgyJOH87UsWA.webp" alt="" class="wp-image-4919" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_frUEGj6qn8sgyJOH87UsWA.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_frUEGj6qn8sgyJOH87UsWA-500x340.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_frUEGj6qn8sgyJOH87UsWA-150x102.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_frUEGj6qn8sgyJOH87UsWA-768x522.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Percentage daily returns for some coins</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="dc7b">It is somewhat surprising that Bitcoin has the least variability in daily returns. The big spike around April 2, 2017 shows a percentage daily return of ~88% for XRP; this is the highest daily return I have seen!</p>
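
<p class="wp-block-paragraph">Both observations can also be checked numerically instead of by eye. The sketch below is one possible way to do it in base R: the standard deviation of the daily returns per coin stands in for "variability", and <code>which.max()</code> locates the largest single return. The <code>toy_returns</code> table is made up (the column names mirror the ones used in this post, and the values are only loosely modelled on the spike described above).</p>

<pre class="wp-block-code"><code># Toy long-format returns table (made-up values, NOT the post's data)
toy_returns <- data.frame(
  Date = rep(as.Date("2017-04-01") + 0:3, times = 2),
  pair_usdt = rep(c("USDT_BTC", "USDT_XRP"), each = 4),
  pct_change = c(0.01, -0.02, 0.015, 0.00,
                 0.10, 0.88, -0.30, 0.05)
)

# Spread of daily returns per coin (standard deviation, in percent)
volatility <- tapply(100 * toy_returns$pct_change, toy_returns$pair_usdt, sd)
print(sort(volatility))

# Row holding the single largest daily return
biggest <- toy_returns[which.max(toy_returns$pct_change), ]
print(biggest)
</code></pre>

<p class="wp-block-paragraph">Sorting the per-coin standard deviations puts the calmest coin first, and the <code>biggest</code> row gives the date and pair of the spike.</p>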



<p class="wp-block-paragraph" id="1715">Let’s look at the percentage daily returns for Bitcoin and Litecoin, since they seem to be highly correlated. I am going to zoom in on the period between 2016-02-01 and 2016-05-01.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70486307144165px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="start_date <- ymd(&quot;2016-02-01&quot;)
end_date <- ymd(&quot;2016-05-01&quot;)
p <- ggplot(alt_data[pair_usdt %like% &quot;BTC|LTC&quot; & Date > start_date & Date < end_date], aes(x = Date, y =  (100*pct_change), col = pair_usdt)) + geom_line() + theme_bw() + ylab(&quot;Daily Return (%)&quot;)
p" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">start_date </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2016-02-01&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">end_date </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2016-05-01&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">p </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> ggplot(alt_data[pair_usdt </span><span style="color: #F92672">%like%</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC|LTC&quot;</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">&</span><span style="color: #F8F8F2"> Date </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> start_date </span><span style="color: #F92672">&</span><span style="color: #F8F8F2"> Date </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> end_date], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2">  (</span><span style="color: #AE81FF">100</span><span style="color: #F92672">*</span><span style="color: #F8F8F2">pct_change), </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> pair_usdt)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Daily Return (%)&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">p</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="480" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_f0yWVUQ02uKmgNS3Euz8cA.webp" alt="" class="wp-image-4920" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_f0yWVUQ02uKmgNS3Euz8cA.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_f0yWVUQ02uKmgNS3Euz8cA-500x290.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_f0yWVUQ02uKmgNS3Euz8cA-150x87.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_f0yWVUQ02uKmgNS3Euz8cA-768x445.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">Daily returns for Bitcoin and LTC, February to May 2016</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="db3e">The figure shows a clear correlation between the daily returns of Bitcoin and Litecoin. It also shows that this correlation can vary over time. In fact, let’s look at how it varies from month to month.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395828247070312px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# these steps are similar to the ones in the previous section; the only difference is that we now look at the daily percentage change in price instead of the actual price
# subset data, only keep the date, the pair, and the price
alt_data_sub_pct <- alt_data[, .(Date, pair_usdt, pct_change)]
# convert to wide format 
alt_data_sub_pct <- spread(data = alt_data_sub_pct, key = &quot;pair_usdt&quot;, value = &quot;pct_change&quot;)
# clean column names
setnames(alt_data_sub_pct, gsub(&quot;USDT_&quot;, &quot;&quot;, colnames(alt_data_sub)))
# subset the data
btc_ltc <- alt_data_sub_pct[, .(Date, BTC, LTC)]
# add a year_month column
btc_ltc[, year_month := as.yearmon(Date)]
# calculate the correlation coefficient on montly basis
btc_ltc_2 <- btc_ltc[, cor(BTC, LTC), by = year_month]
# now plot the correlation coefficient as a function of month and year
plot(btc_ltc_2$year_month, btc_ltc_2$V1, xlab = &quot;Year-Month&quot;, main = &quot;Correlation Coeff. Between Daily Returns of BTC and LTC Over time&quot;
     , ylab = &quot;Correlation Coefficient&quot;, type = &quot;b&quot;, pch = 19, col = ifelse(btc_ltc_2$V1 > 0, &quot;blue&quot;, &quot;red&quot;)
     , ylim = c(-1, 1))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># these steps are similar to the ones in the previous section, the only differnect is that now we are looking at the percentage change in price difference on daily basis instead of the actual price</span></span>
<span class="line"><span style="color: #88846F"># subset data, only keep the date, the pair, and the price</span></span>
<span class="line"><span style="color: #F8F8F2">alt_data_sub_pct </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> alt_data[, .(Date, pair_usdt, pct_change)]</span></span>
<span class="line"><span style="color: #88846F"># convert to wide format </span></span>
<span class="line"><span style="color: #F8F8F2">alt_data_sub_pct </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> spread(</span><span style="color: #FD971F">data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> alt_data_sub_pct, </span><span style="color: #FD971F">key</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;pair_usdt&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">value</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;pct_change&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #88846F"># clean column names</span></span>
<span class="line"><span style="color: #F8F8F2">setnames(alt_data_sub_pct, </span><span style="color: #66D9EF">gsub</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;USDT_&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">colnames</span><span style="color: #F8F8F2">(alt_data_sub)))</span></span>
<span class="line"><span style="color: #88846F"># subset the data</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> alt_data_sub_pct[, .(Date, BTC, LTC)]</span></span>
<span class="line"><span style="color: #88846F"># add a year_month column</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc[, year_month </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> as.yearmon(Date)]</span></span>
<span class="line"><span style="color: #88846F"># calculate the correlation coefficient on montly basis</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc_2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_ltc[, </span><span style="color: #66D9EF">cor</span><span style="color: #F8F8F2">(BTC, LTC), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> year_month]</span></span>
<span class="line"><span style="color: #88846F"># now plot the correlation coefficient as a function of month and year</span></span>
<span class="line"><span style="color: #66D9EF">plot</span><span style="color: #F8F8F2">(btc_ltc_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">year_month, btc_ltc_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1, </span><span style="color: #FD971F">xlab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Year-Month&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">main</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coeff. Between Daily Returns of BTC and LTC Over time&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coefficient&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">type</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;b&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">pch</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">19</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">ifelse</span><span style="color: #F8F8F2">(btc_ltc_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1 </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;blue&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;red&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylim</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="519" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_YozYWrtQM_qHssvavSNHFw-1.webp" alt="" class="wp-image-4922" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_YozYWrtQM_qHssvavSNHFw-1.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_YozYWrtQM_qHssvavSNHFw-1-500x313.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_YozYWrtQM_qHssvavSNHFw-1-150x94.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_YozYWrtQM_qHssvavSNHFw-1-768x481.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="62de">Interesting: the correlation between the daily percentage changes in the prices of Bitcoin and Litecoin sits mostly on the positive side. Only one month shows a negative correlation, and barely so. This is quite different from what we saw between Bitcoin and DASH, but that was for the actual prices and not the daily returns. Let’s redo this plot, but this time for the actual prices of Bitcoin and Litecoin, just like we did with DASH.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395828247070312px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# subset the data
btc_ltc_price <- alt_data_sub[, .(Date, BTC, LTC)]
# add a year_month column
btc_ltc_price[, year_month := as.yearmon(Date)]
# calculate the correlation coefficient on a monthly basis
btc_ltc_price_2 <- btc_ltc_price[, cor(BTC, LTC), by = year_month]
# now plot the correlation coefficient as a function of month and year
plot(btc_ltc_price_2$year_month, btc_ltc_price_2$V1, xlab = &quot;Year-Month&quot;, main = &quot;Correlation Coeff. Between BTC and Litecoin Over time&quot;
     , ylab = &quot;Correlation Coefficient&quot;, type = &quot;b&quot;, pch = 19, col = ifelse(btc_ltc_price_2$V1 > 0, &quot;blue&quot;, &quot;red&quot;)
     , ylim = c(-1, 1))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># subset the data</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc_price </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> alt_data_sub[, .(Date, BTC, LTC)]</span></span>
<span class="line"><span style="color: #88846F"># add a year_month column</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc_price[, year_month </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> as.yearmon(Date)]</span></span>
<span class="line"><span style="color: #88846F"># calculate the correlation coefficient on a monthly basis</span></span>
<span class="line"><span style="color: #F8F8F2">btc_ltc_price_2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_ltc_price[, </span><span style="color: #66D9EF">cor</span><span style="color: #F8F8F2">(BTC, LTC), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> year_month]</span></span>
<span class="line"><span style="color: #88846F"># now plot the correlation coefficient as a function of month and year</span></span>
<span class="line"><span style="color: #66D9EF">plot</span><span style="color: #F8F8F2">(btc_ltc_price_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">year_month, btc_ltc_price_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1, </span><span style="color: #FD971F">xlab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Year-Month&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">main</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coeff. Between BTC and Litecoin Over time&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Correlation Coefficient&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">type</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;b&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">pch</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">19</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">ifelse</span><span style="color: #F8F8F2">(btc_ltc_price_2</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">V1 </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;blue&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;red&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">     , </span><span style="color: #FD971F">ylim</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span></code></pre></div>



<p class="wp-block-paragraph" id="494c">The monthly correlations of the daily returns of Bitcoin and Litecoin that we saw in the previous figure (boy, this is a mouthful) follow trends very similar to those of the prices.</p>
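
<p class="wp-block-paragraph">Prices and returns do not have to agree like this in general. The minimal sketch below, on made-up data rather than the crypto data used in this post, builds two price series from independently drawn daily returns and computes both correlations. The names <code>price_a</code>, <code>price_b</code> and <code>pct_change()</code> are hypothetical and only serve the illustration.</p>

<pre class="wp-block-code"><code># toy data, made up for illustration: two coins whose daily returns are independent
set.seed(42)
n &lt;- 250
ret_a &lt;- rnorm(n, mean = 0.002, sd = 0.02)   # daily returns of coin A
ret_b &lt;- rnorm(n, mean = 0.002, sd = 0.02)   # daily returns of coin B, drawn independently of A

# turn the returns into price paths
price_a &lt;- 100 * cumprod(1 + ret_a)
price_b &lt;- 50 * cumprod(1 + ret_b)

# daily percentage change recovered from a price vector (first value has no predecessor)
pct_change &lt;- function(p) c(NA, diff(p) / head(p, -1))

# correlation of the price levels vs. correlation of the daily returns
cor_prices  &lt;- cor(price_a, price_b)
cor_returns &lt;- cor(pct_change(price_a), pct_change(price_b), use = "complete.obs")
print(c(prices = cor_prices, returns = cor_returns))</code></pre>

<p class="wp-block-paragraph">Since the two return series are drawn independently, <code>cor_returns</code> should land near zero, while <code>cor_prices</code> can come out far from zero simply because both prices drift in the same direction. This is one reason to look at returns as well as prices.</p>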



<p class="wp-block-paragraph" id="2a55">In the next post we’ll do something more statistically sound: rolling correlations.</p>
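
<p class="wp-block-paragraph">As a rough preview of the idea, and not the code from the next post, here is a minimal base R sketch of a rolling correlation: instead of one number per calendar month, we compute the correlation over a trailing window that slides forward one day at a time. The function <code>rolling_cor()</code>, the 30-day window and the toy return series are all assumptions made for illustration.</p>

<pre class="wp-block-code"><code># rolling correlation of x and y over a trailing window of `width` observations
# the first width - 1 entries are NA because the window is not yet full
rolling_cor &lt;- function(x, y, width = 30) {
  stopifnot(length(x) == length(y), width &gt;= 2)
  n &lt;- length(x)
  out &lt;- rep(NA_real_, n)
  if (n &lt; width) return(out)
  for (i in width:n) {
    idx &lt;- (i - width + 1):i
    out[i] &lt;- cor(x[idx], y[idx])
  }
  out
}

# toy daily returns, made up for illustration
set.seed(1)
btc &lt;- rnorm(120, sd = 0.03)
ltc &lt;- 0.7 * btc + rnorm(120, sd = 0.02)

rc &lt;- rolling_cor(btc, ltc, width = 30)
plot(rc, type = "l", ylim = c(-1, 1), xlab = "Day", ylab = "30-day rolling correlation")</code></pre>

<p class="wp-block-paragraph">Unlike the monthly grouping used above, every point here is based on the same number of observations, and the estimate does not jump at month boundaries.</p>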



<p>The post <a href="https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-2/">Analyzing Crypto Market using R — Part 2</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Analyzing Crypto Markets using R — Part 1</title>
		<link>https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-1/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Sat, 01 Dec 2018 10:19:38 +0000</pubDate>
				<category><![CDATA[Data Science]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Bitcoin]]></category>
		<category><![CDATA[Cryptocurrency]]></category>
		<category><![CDATA[R]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4897</guid>

					<description><![CDATA[<p>Downloading and Processing Crypto Data with R. Analyzing the crypto market. Aous Abdo, WWW.ANALYTICADSS.COM. An interactive version of this post can be found here. There is no doubt that cryptocurrencies, with all the promise they bring, both financially and otherwise, are here to stay. As a data scientist interested in data and numbers, I thought it would be nice [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-1/">Analyzing Crypto Markets using R — Part 1</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading" id="e5da">Downloading and Processing Crypto Data with R</h2>



<p class="wp-block-paragraph">Analyzing crypto market</p>



<p class="wp-block-paragraph"><a href="https://medium.com/u/4f20dbfad286?source=post_page-----9e0d1bff7c63--------------------------------" rel="noreferrer noopener" target="_blank">Aous Abdo</a>, <a href="http://www.analyticadss.com/" rel="noreferrer noopener" target="_blank">WWW.ANALYTICADSS.COM</a><br>An interactive version of this post can be found <a href="https://analyticadss.com/adss_blog/crypto_notebook_part1.nb.html" rel="noreferrer noopener" target="_blank">here</a>.</p>



<p class="wp-block-paragraph" id="36bb">There is no doubt that cryptocurrencies, with all the promise they bring, both financially and otherwise, are here to stay. As a data scientist interested in data and numbers, I thought it would be nice to take a look at some cryptocurrencies with my favorite tool, <a href="https://cran.r-project.org/" target="_blank" rel="noreferrer noopener"><strong>R</strong></a>.</p>



<h2 class="wp-block-heading" id="65a0">R Libraries</h2>



<p class="wp-block-paragraph" id="33d5">Below is a list of the <strong>R</strong> libraries we will be using in our analysis. Not all of them are strictly necessary, but they all make our lives easier.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395828247070312px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="library(PoloniexR)
library(data.table)
library(lubridate)
library(Quandl)
library(plyr)
library(stringr)
library(ggplot2)
library(plotly)
library(janitor)
library(quantmod)
library(pryr)
library(corrplot)
library(PerformanceAnalytics)
library(tidyr)
library(MLmetrics)
library(readr)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(PoloniexR)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(data.table)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(lubridate)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(Quandl)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(plyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(stringr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(ggplot2)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(plotly)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(janitor)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(quantmod)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(pryr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(corrplot)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(PerformanceAnalytics)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(tidyr)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(MLmetrics)</span></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(readr)</span></span></code></pre></div>



<h2 class="wp-block-heading" id="f526">Getting the Data</h2>



<h2 class="wp-block-heading" id="2c1e">1. PoloniexR Package</h2>



<p class="wp-block-paragraph" id="69f7">The easiest way to get current and historical data for <strong>crypto </strong>currencies is by using the <strong><a href="https://cran.r-project.org/web/packages/PoloniexR/index.html" target="_blank" rel="noreferrer noopener">PoloniexR</a> </strong>package developed by <em>Vermeir Jellen</em>. <em>Vermeir Jellen </em>gives a good tutorial on how to get started with his package <a href="https://github.com/VermeirJellen/PoloniexR" target="_blank" rel="noreferrer noopener"><strong>here</strong></a>. The <a href="https://poloniex.com/exchange" target="_blank" rel="noreferrer noopener"><strong>Poloniex exchange</strong></a> lists many coins, but not all of them. For coins missing from Poloniex, one can scrape the <a href="http://rstudio-pubs-static.s3.amazonaws.com/www.coinmarketcap.com" target="_blank" rel="noreferrer noopener">coinmarketcap</a> page; an example is given here.</p>



<h2 class="wp-block-heading" id="34de">2. Quandl</h2>



<p class="wp-block-paragraph" id="11e7"><strong><a href="https://www.quandl.com/" target="_blank" rel="noreferrer noopener">Quandl</a> </strong>is my go-to place for any financial data. Their free-tier API has lots of good data one can use, and <strong>Quandl </strong>offers data from multiple exchanges. Locating crypto data on <strong>Quandl </strong>is not straightforward, though. After spending a few hours on their site, I found out that most of the crypto data can be found <a href="https://www.quandl.com/data/BITFINEX-Bitfinex" target="_blank" rel="noreferrer noopener">here</a>.</p>



<p class="wp-block-paragraph" id="5a62">First, let’s take a look at Bitcoin data from different exchanges using <strong>Quandl</strong>. We will download and plot historical Bitcoin data from the following exchanges: Kraken, <strong>Coinbase</strong>, <strong>Bitstamp</strong>, and ITBIT.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# enable your Quandl API key
my_quandl_api_key <- read_file(&quot;../../quandl_api_key.txt&quot;)
Quandl.api_key(my_quandl_api_key)
# function to download quandl data
get_quandl_data <- function(data_source = &quot;BITFINEX&quot;
                            , pair = 'btcusd'
                            , ...){
  
  # make sure the user supplied the correct data_source
  if(toupper(data_source) != &quot;BITFINEX&quot;) stop(&quot;data source supplied is wrong...&quot;)
  # quandl is case sensitive, all codes have to be upper case
  pair <- toupper(pair)
  tmp <- NA
  try(tmp <- Quandl(code = toupper(paste(data_source, pair, sep = &quot;/&quot;)), ...), silent = TRUE)
  return(tmp)
}
# get btc data from different exchanges
  exchange_data <- list()
  
  exchanges <- c('KRAKENUSD','COINBASEUSD','BITSTAMPUSD','ITBITUSD')
  
  for (i in exchanges){
    exchange_data[[i]] <- Quandl(paste0('BCHARTS/', i))
  }" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># enable your Quandl API key</span></span>
<span class="line"><span style="color: #F8F8F2">my_quandl_api_key </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> read_file(</span><span style="color: #E6DB74">&quot;../../quandl_api_key.txt&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">Quandl.api_key(my_quandl_api_key)</span></span>
<span class="line"><span style="color: #88846F"># function to download quandl data</span></span>
<span class="line"><span style="color: #A6E22E">get_quandl_data</span><span style="color: #F8F8F2"> </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">function</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">data_source</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BITFINEX&quot;</span></span>
<span class="line"><span style="color: #F8F8F2">                            , </span><span style="color: #FD971F">pair</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&#39;btcusd&#39;</span></span>
<span class="line"><span style="color: #F8F8F2">                            , </span><span style="color: #F92672">...</span><span style="color: #F8F8F2">){</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># make sure the user supplied the correct data_source</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">if</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">toupper</span><span style="color: #F8F8F2">(data_source) </span><span style="color: #F92672">!=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BITFINEX&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #66D9EF">stop</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;data source supplied is wrong...&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># quandl is case sensitive, all codes have to be upper case</span></span>
<span class="line"><span style="color: #F8F8F2">  pair </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">toupper</span><span style="color: #F8F8F2">(pair)</span></span>
<span class="line"><span style="color: #F8F8F2">  tmp </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">NA</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">try</span><span style="color: #F8F8F2">(tmp </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> Quandl(</span><span style="color: #FD971F">code</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">toupper</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">paste</span><span style="color: #F8F8F2">(data_source, pair, </span><span style="color: #FD971F">sep</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;/&quot;</span><span style="color: #F8F8F2">)), </span><span style="color: #F92672">...</span><span style="color: #F8F8F2">), </span><span style="color: #FD971F">silent</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">return</span><span style="color: #F8F8F2">(tmp)</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"><span style="color: #88846F"># get btc data from different exchanges</span></span>
<span class="line"><span style="color: #F8F8F2">  exchange_data </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">()</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  exchanges </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&#39;KRAKENUSD&#39;</span><span style="color: #F8F8F2">,</span><span style="color: #E6DB74">&#39;COINBASEUSD&#39;</span><span style="color: #F8F8F2">,</span><span style="color: #E6DB74">&#39;BITSTAMPUSD&#39;</span><span style="color: #F8F8F2">,</span><span style="color: #E6DB74">&#39;ITBITUSD&#39;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> exchanges){</span></span>
<span class="line"><span style="color: #F8F8F2">    exchange_data[[i]] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> Quandl(</span><span style="color: #66D9EF">paste0</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&#39;BCHARTS/&#39;</span><span style="color: #F8F8F2">, i))</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span></code></pre></div>



<p class="wp-block-paragraph" id="935a">We now need to combine this list of BTC prices from the different exchanges into a single <strong>data frame</strong> so we can plot them together.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.7048797607421875px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# put them all in one dataframe to plot in ggplot2
btc_usd <- do.call(&quot;rbind&quot;, exchange_data)
btc_usd$exchange <- row.names(btc_usd)
btc_usd <- as.data.table(btc_usd)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># put them all in one dataframe to plot in ggplot2</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">do.call</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;rbind&quot;</span><span style="color: #F8F8F2">, exchange_data)</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd</span><span style="color: #F92672">$</span><span style="color: #F8F8F2">exchange </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">row.names</span><span style="color: #F8F8F2">(btc_usd)</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> as.data.table(btc_usd)</span></span></code></pre></div>
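


<p class="wp-block-paragraph">As a side note, here is a minimal sketch of another way to stack a named list, shown on made-up data rather than the Quandl download. It assumes the <strong>data.table</strong> package is loaded; <code>toy_data</code>, <code>toy_btc</code>, and the prices are invented for illustration. The <code>idcol</code> argument of <code>rbindlist()</code> stores each list element’s name in its own column, so the exchange name does not have to be recovered from row names afterwards.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" style="font-size:.875rem;line-height:1.25rem"><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">library(data.table)</span></span>
<span class="line"><span style="color: #88846F"># toy stand-in for exchange_data; the dates and prices are made up</span></span>
<span class="line"><span style="color: #F8F8F2">toy_data <- list(</span></span>
<span class="line"><span style="color: #F8F8F2">  KRAKENUSD   = data.frame(Date = as.Date(&quot;2018-01-01&quot;) + 0:1, Price = c(100, 110)),</span></span>
<span class="line"><span style="color: #F8F8F2">  COINBASEUSD = data.frame(Date = as.Date(&quot;2018-01-01&quot;) + 0:1, Price = c(101, 108))</span></span>
<span class="line"><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #88846F"># idcol = &quot;exchange&quot; puts each list element&#39;s name in a new column</span></span>
<span class="line"><span style="color: #F8F8F2">toy_btc <- rbindlist(toy_data, idcol = &quot;exchange&quot;)</span></span>
<span class="line"><span style="color: #F8F8F2">toy_btc</span></span></code></pre></div>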



<p class="wp-block-paragraph">The data also needs some minor cleaning: we extract the exchange name from the row names, standardize the column names, and drop rows with a weighted price of 0.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# some data cleaning
btc_usd[, exchange := as.factor(str_extract(exchange, &quot;[A-Z]+&quot;))]
btc_usd <- clean_names(btc_usd)
btc_usd <- btc_usd[weighted_price > 0]
# set datatable key to be the date column
setkey(btc_usd, date)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># some data cleaning</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd[, exchange </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">as.factor</span><span style="color: #F8F8F2">(str_extract(exchange, </span><span style="color: #E6DB74">&quot;[A-Z]+&quot;</span><span style="color: #F8F8F2">))]</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> clean_names(btc_usd)</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_usd[weighted_price </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">]</span></span>
<span class="line"><span style="color: #88846F"># set datatable key to be the date column</span></span>
<span class="line"><span style="color: #F8F8F2">setkey(btc_usd, date)</span></span></code></pre></div>
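


<p class="wp-block-paragraph">To see why the regular expression <code>[A-Z]+</code> is enough here: when <code>rbind</code> stacks a named list of data frames, the row names look like <code>KRAKENUSD.1</code>, <code>KRAKENUSD.2</code>, and so on. The small sketch below uses a hand-written vector, <code>row_ids</code>, as a stand-in for those row names and assumes the <strong>stringr</strong> package is loaded.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" style="font-size:.875rem;line-height:1.25rem"><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">library(stringr)</span></span>
<span class="line"><span style="color: #88846F"># hypothetical row names, in the form rbind produces for a named list</span></span>
<span class="line"><span style="color: #F8F8F2">row_ids <- c(&quot;KRAKENUSD.1&quot;, &quot;KRAKENUSD.2&quot;, &quot;COINBASEUSD.1&quot;)</span></span>
<span class="line"><span style="color: #88846F"># keep the first run of capital letters, which drops the numeric suffix</span></span>
<span class="line"><span style="color: #F8F8F2">exchange_names <- str_extract(row_ids, &quot;[A-Z]+&quot;)</span></span>
<span class="line"><span style="color: #F8F8F2">exchange_names</span></span></code></pre></div>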



<p class="wp-block-paragraph" id="4b66">Let’s take a look at the data table we just made.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="head(btc_usd)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">head</span><span style="color: #F8F8F2">(btc_usd)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="245" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_2OI9hXg9BGzha6OQ-KkIsg.webp" alt="" class="wp-image-4898" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_2OI9hXg9BGzha6OQ-KkIsg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_2OI9hXg9BGzha6OQ-KkIsg-500x148.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_2OI9hXg9BGzha6OQ-KkIsg-150x44.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_2OI9hXg9BGzha6OQ-KkIsg-768x227.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="ed3e">The data includes 10 columns: the date, the <strong>OHLC </strong>(open, high, low, close) prices, volumes in USD and BTC, the weighted price, and the exchange. I wish I had bought me some <em>bitcoin </em>back in 2011!!!</p>



<p class="wp-block-paragraph" id="5c06">Now we’ll look at the price of bitcoin and color code it by exchange.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704833984375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="ggplot(btc_usd, aes(x = date, y = weighted_price, col = exchange)) + geom_line() + theme_bw()" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">ggplot(btc_usd, aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> 
weighted_price, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> exchange)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw()</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="497" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_pKZF3UsW8ntNBCpd7RhgTA.webp" alt="" class="wp-image-4899" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_pKZF3UsW8ntNBCpd7RhgTA.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_pKZF3UsW8ntNBCpd7RhgTA-500x300.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_pKZF3UsW8ntNBCpd7RhgTA-150x90.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_pKZF3UsW8ntNBCpd7RhgTA-768x461.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<h2 class="wp-block-heading" id="7d74">Arbitrage</h2>



<p class="wp-block-paragraph" id="0a27">It appears that the prices of <strong>btc </strong>on the different exchanges are fairly consistent. But this is an artifact of the figure, since the price spans several orders of magnitude over the timeline we selected. To better see any price differences we need to zoom in on the figure. Let’s zoom in on, say, the first month of <strong>2018</strong>, when we had the <strong>ATH </strong>for all coins.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70489501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="ggplot(btc_usd[date >= ymd(&quot;2018-01-01&quot;) & date <= ymd(&quot;2018-01-31&quot;)], aes(x = date, y = weighted_price, col = exchange)) + geom_line() + theme_bw()" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">ggplot(btc_usd[date </span><span style="color: #F92672">>=</span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2018-01-01&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">&</span><span style="color: 
#F8F8F2"> date </span><span style="color: #F92672"><=</span><span style="color: #F8F8F2"> ymd(</span><span style="color: #E6DB74">&quot;2018-01-31&quot;</span><span style="color: #F8F8F2">)], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> weighted_price, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> exchange)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_line() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw()</span></span></code></pre></div>



<p class="wp-block-paragraph">There are obvious differences in prices between the exchanges, and the differences seem to vary over time as well. It would be interesting to look at the maximum price difference as a function of time, so let’s do that.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70489501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# first let's find the minimum price by date
btc_usd[, min_price := min(weighted_price), by = date]
# now we need to find the price difference between the price for each day and the minimum price for that day
# but since the price of bitcoin varies a lot for the time period under study, we need to normalize the price difference
# to do that we will just divide by the median price for each day
btc_usd[, price_diff := 100*(weighted_price - min_price)/median(weighted_price), by = (date)]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># first let&#39;s find the minimum price by date</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd[, min_price </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(weighted_price), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date]</span></span>
<span class="line"><span style="color: #88846F"># now we need to find the price difference between the price for each day and the minimum price for that day</span></span>
<span class="line"><span style="color: #88846F"># but since the price of bitcoin varies a lot for the time period under study, we need to normalize the price difference</span></span>
<span class="line"><span style="color: #88846F"># to do that we will just divide by the median price for each day</span></span>
<span class="line"><span style="color: #F8F8F2">btc_usd[, price_diff </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">100</span><span style="color: #F92672">*</span><span style="color: #F8F8F2">(weighted_price </span><span style="color: #F92672">-</span><span style="color: #F8F8F2"> min_price)</span><span style="color: #F92672">/</span><span style="color: #66D9EF">median</span><span style="color: #F8F8F2">(weighted_price), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> (date)]</span></span></code></pre></div>
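


<p class="wp-block-paragraph">A quick worked example may help check the arithmetic. The sketch below applies the same two steps to a made-up table, <code>toy</code>, with three hypothetical exchanges on a single day. With prices of 100, 102, and 104, the daily minimum is 100 and the daily median is 102, so the normalized differences come out to 0, 200/102, and 400/102 percent.</p>



<div class="wp-block-kevinbatdorf-code-block-pro" style="font-size:.875rem;line-height:1.25rem"><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">library(data.table)</span></span>
<span class="line"><span style="color: #88846F"># made-up prices for three hypothetical exchanges on one day</span></span>
<span class="line"><span style="color: #F8F8F2">toy <- data.table(</span></span>
<span class="line"><span style="color: #F8F8F2">  date = as.Date(&quot;2018-01-01&quot;),</span></span>
<span class="line"><span style="color: #F8F8F2">  exchange = c(&quot;A&quot;, &quot;B&quot;, &quot;C&quot;),</span></span>
<span class="line"><span style="color: #F8F8F2">  weighted_price = c(100, 102, 104)</span></span>
<span class="line"><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #88846F"># same two steps as above: daily minimum, then difference as a percentage of the daily median</span></span>
<span class="line"><span style="color: #F8F8F2">toy[, min_price := min(weighted_price), by = date]</span></span>
<span class="line"><span style="color: #F8F8F2">toy[, price_diff := 100 * (weighted_price - min_price) / median(weighted_price), by = date]</span></span>
<span class="line"><span style="color: #F8F8F2">round(toy$price_diff, 2)  </span><span style="color: #88846F"># 0.00 1.96 3.92</span></span></code></pre></div>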



<p class="wp-block-paragraph">Now we have a new column that gives, for each exchange and day, the price difference from that day’s minimum as a percentage of that day’s median price. Let’s take a look at the new table.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="tail(btc_usd)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">tail</span><span style="color: #F8F8F2">(btc_usd)</span></span></code></pre></div>



<p class="wp-block-paragraph" id="574b">The reason I looked at the newer dates is that prior to <strong>2014 </strong>we only have data for one exchange, so all the price differences were <strong>0</strong>. Let’s take a look at the price differences as a function of time.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704833984375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# first we need to create a new data table with only the maximum price difference per day
tmp <- btc_usd[, .(price_diff = max(price_diff)), by = date]
# This will help us visualize overlapping points
MyGray <- rgb(t(col2rgb(&quot;black&quot;)), alpha=50, maxColorValue=255)
tmp[, plot(date, price_diff, pch=20, col = MyGray, xlab = &quot;Date&quot;, ylab = &quot;Maximum of Price Difference (%)&quot;)]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># first we need to create a new data table with only the maximum price difference per day</span></span>
<span class="line"><span style="color: #F8F8F2">tmp </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> btc_usd[, .(</span><span style="color: #FD971F">price_diff</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(price_diff)), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date]</span></span>
<span class="line"><span style="color: #88846F"># This will help us visualize overlapping points</span></span>
<span class="line"><span style="color: #F8F8F2">MyGray </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">rgb</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">t</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">col2rgb</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;black&quot;</span><span style="color: #F8F8F2">)), </span><span style="color: #FD971F">alpha</span><span style="color: #F92672">=</span><span style="color: #AE81FF">50</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">maxColorValue</span><span style="color: #F92672">=</span><span style="color: #AE81FF">255</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">tmp[, </span><span style="color: #66D9EF">plot</span><span style="color: #F8F8F2">(date, price_diff, </span><span style="color: #FD971F">pch</span><span style="color: #F92672">=</span><span style="color: #AE81FF">20</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> MyGray, </span><span style="color: #FD971F">xlab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Date&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">ylab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Maximum of Price Difference (%)&quot;</span><span style="color: #F8F8F2">)]</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="450" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_OCmdv1nAXyXQXtM-V_p4rw.webp" alt="" class="wp-image-4900" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_OCmdv1nAXyXQXtM-V_p4rw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_OCmdv1nAXyXQXtM-V_p4rw-500x272.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_OCmdv1nAXyXQXtM-V_p4rw-150x82.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_OCmdv1nAXyXQXtM-V_p4rw-768x417.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="8ee2">Let’s show the plot with a log scale on the <strong>y axis</strong>. Since zero cannot be drawn on a log axis, let’s also discard dates with zero price differences.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704833984375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="tmp[price_diff > 0, plot(date, price_diff, pch=20, col = MyGray, log = &quot;y&quot; , xlab = &quot;Date&quot;, ylab = &quot;Maximum of Price Difference (%)&quot;)]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">tmp[price_diff </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">plot</span><span style="color: #F8F8F2">(date, price_diff, 
</span><span style="color: #FD971F">pch</span><span style="color: #F92672">=</span><span style="color: #AE81FF">20</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">col</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> MyGray, </span><span style="color: #FD971F">log</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;y&quot;</span><span style="color: #F8F8F2"> , </span><span style="color: #FD971F">xlab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Date&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">ylab</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Maximum of Price Difference (%)&quot;</span><span style="color: #F8F8F2">)]</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="464" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_hW8T7SyWYrpL4FWYN7Y5sg.webp" alt="" class="wp-image-4901" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_hW8T7SyWYrpL4FWYN7Y5sg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_hW8T7SyWYrpL4FWYN7Y5sg-500x280.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_hW8T7SyWYrpL4FWYN7Y5sg-150x84.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_hW8T7SyWYrpL4FWYN7Y5sg-768x430.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph">As one can see from the figure above, the bulk of the maximum differences in bitcoin prices between the exchanges fall in the <strong>0.5–2.0%</strong> range. It is also interesting to see that the differences in prices seem to have come down between 2014 and 2016, but they seem to go up starting in <strong>2017</strong>. Let’s fit a GAM (generalized additive model) to see what we get.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.703125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# we'll use ggplot and fit a gam smooth line
ggplot(tmp[price_diff > 0 ], aes(x = date, y = price_diff)) + geom_point(alpha = 0.2, shape = 16, size = 3, show.legend = FALSE) + scale_y_continuous(trans='log10') + geom_smooth(method = &quot;gam&quot;, formula = y ~ s(x, bs = &quot;cs&quot;)) + theme_bw() + xlab(&quot;Date&quot;) + ylab(&quot;Maximum of Price Difference (%)&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># we&#39;ll use ggplot and fit a gam smooth line</span></span>
<span class="line"><span style="color: #F8F8F2">ggplot(tmp[price_diff </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2"> ], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> price_diff)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_point(</span><span style="color: #FD971F">alpha</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0.2</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">shape</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">16</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">size</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">show.legend</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">FALSE</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> scale_y_continuous(</span><span style="color: #FD971F">trans</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;log10&#39;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_smooth(</span><span style="color: #FD971F">method</span><span style="color: #F8F8F2"> </span><span style="color: 
#F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;gam&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">formula</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> y </span><span style="color: #F92672">~</span><span style="color: #F8F8F2"> s(x, </span><span style="color: #FD971F">bs</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;cs&quot;</span><span style="color: #F8F8F2">)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> xlab(</span><span style="color: #E6DB74">&quot;Date&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Maximum of Price Difference (%)&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="492" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_6-thiHBpH1oNCkiqPmFugA.webp" alt="" class="wp-image-4902" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_6-thiHBpH1oNCkiqPmFugA.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_6-thiHBpH1oNCkiqPmFugA-500x297.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_6-thiHBpH1oNCkiqPmFugA-150x89.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_6-thiHBpH1oNCkiqPmFugA-768x456.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph">The regression line shows a hint of an overall downtrend from <strong>2014</strong> to mid-<strong>2016</strong>, except for an uptrend lasting a few months in late <strong>2015</strong>. The trend turns upward in mid to late <strong>2017</strong>, and we see a downward movement in price differences again starting in December 2017. This can be seen better in the <em>box-plot</em> figure below.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.703125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="tmp[, month_year := format(as.Date(date), &quot;%Y-%m&quot;)]
ggplot(tmp[price_diff > 0], aes(x = month_year, y = price_diff)) + geom_boxplot() + scale_y_continuous(trans='log10') + xlab(&quot;Date (Year-Month)&quot;) + ylab(&quot;Maximum of Price Difference (%)&quot;) + theme_bw() + theme(axis.text.x = element_text(angle = 90, hjust = 1))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">tmp[, month_year </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">format</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">as.Date</span><span style="color: #F8F8F2">(date), </span><span style="color: #E6DB74">&quot;%Y-%m&quot;</span><span style="color: #F8F8F2">)]</span></span>
<span class="line"><span style="color: #F8F8F2">ggplot(tmp[price_diff </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">], aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> month_year, </span><span style="color: #FD971F">y</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> price_diff)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_boxplot() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> scale_y_continuous(</span><span style="color: #FD971F">trans</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;log10&#39;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> xlab(</span><span style="color: #E6DB74">&quot;Date (Year-Month)&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Maximum of Price Difference (%)&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme(</span><span style="color: #FD971F">axis.text.x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> element_text(</span><span style="color: #FD971F">angle</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">90</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">hjust</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> 
</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">))</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="499" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_KKc4nGtHQg6RBCxzoKkaMw.webp" alt="" class="wp-image-4903" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_KKc4nGtHQg6RBCxzoKkaMw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_KKc4nGtHQg6RBCxzoKkaMw-500x301.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_KKc4nGtHQg6RBCxzoKkaMw-150x90.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_KKc4nGtHQg6RBCxzoKkaMw-768x463.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="60f8">The <em>box-plot</em> figure above shows the variation of the maximum price differences as a function of time. On the <strong>x-axis</strong> I grouped dates by month, since anything shorter than a one-month period would result in a congested figure.</p>



<p class="wp-block-paragraph" id="f970"><strong>Okay</strong>, now let’s find out which exchanges contribute the most to these price differences. That is, we are trying to determine which exchanges are consistently selling bitcoin at a higher, or lower, price than the rest. We need to pull some numbers as below, and then we’ll make a bar plot to show the leading exchanges in each category.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# add two columns to our data table which will contain the minimum and maximum prices
tmp[, `:=`(day_max = max(weighted_price), day_min = min(weighted_price)), by = date]
# now put only the columns we care about in a new data.table
tmp2 <- tmp[price_diff > 0 , .(date, exchange, weighted_price, day_min, day_max)]
# notice how we excluded days with no price difference
# now we only want to keep the rows with the maximum and minimum daily prices
tmp2 <- tmp2[weighted_price == day_min | weighted_price == day_max]
# now we'll add a new column designating the price as being the minimum or maximum
tmp2[, max_min := ifelse(weighted_price == day_min, &quot;min&quot;, &quot;max&quot;)]
# clean the name of the exchange
tmp2[, exchange := gsub(&quot;USD&quot;, &quot;&quot;, exchange)]
# now we'll add a new column containing the exchange name and the min_max column
tmp2[, max_min_exchange := paste(max_min, exchange, sep = &quot;-&quot;)]" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># add two columns to our data table which will contain the minimum and maximum prices</span></span>
<span class="line"><span style="color: #F8F8F2">tmp[, `:=`(</span><span style="color: #FD971F">day_max</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(weighted_price), </span><span style="color: #FD971F">day_min</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(weighted_price)), </span><span style="color: #FD971F">by</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> date]</span></span>
<span class="line"><span style="color: #88846F"># now put only the columns we care about in a new data.table</span></span>
<span class="line"><span style="color: #F8F8F2">tmp2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> tmp[price_diff </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2"> , .(date, exchange, weighted_price, day_min, day_max)]</span></span>
<span class="line"><span style="color: #88846F"># notice how we excluded days with no price difference</span></span>
<span class="line"><span style="color: #88846F"># now we only want to keep the rows with the maximum and minimum daily prices</span></span>
<span class="line"><span style="color: #F8F8F2">tmp2 </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> tmp2[weighted_price </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> day_min </span><span style="color: #F92672">|</span><span style="color: #F8F8F2"> weighted_price </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> day_max]</span></span>
<span class="line"><span style="color: #88846F"># now we&#39;ll add a new column designating the price as being the minimum or maximum</span></span>
<span class="line"><span style="color: #F8F8F2">tmp2[, max_min </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">ifelse</span><span style="color: #F8F8F2">(weighted_price </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> day_min, </span><span style="color: #E6DB74">&quot;min&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;max&quot;</span><span style="color: #F8F8F2">)]</span></span>
<span class="line"><span style="color: #88846F"># clean the name of the exchange</span></span>
<span class="line"><span style="color: #F8F8F2">tmp2[, exchange </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">gsub</span><span style="color: #F8F8F2">(</span><span style="color: #E6DB74">&quot;USD&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #E6DB74">&quot;&quot;</span><span style="color: #F8F8F2">, exchange)]</span></span>
<span class="line"><span style="color: #88846F"># now we&#39;ll add a new column containing the exchange name and the min_max column</span></span>
<span class="line"><span style="color: #F8F8F2">tmp2[, max_min_exchange </span><span style="color: #F92672">:=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">paste</span><span style="color: #F8F8F2">(max_min, exchange, </span><span style="color: #FD971F">sep</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;-&quot;</span><span style="color: #F8F8F2">)]</span></span></code></pre></div>



<p class="wp-block-paragraph">In the above chunk of code we created a new table that contains the maximum and minimum prices for each day. The table also contains a categorical column showing which exchange each <strong>max/min</strong> price belongs to, and whether the price was a maximum or a minimum. Before we plot the table above, let’s have a quick look at it.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.704864501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="head(tmp2)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #66D9EF">head</span><span style="color: #F8F8F2">(tmp2)</span></span></code></pre></div>





<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="251" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_yKe1-J73s4kQWcCYbVKIgw.webp" alt="" class="wp-image-4904" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_yKe1-J73s4kQWcCYbVKIgw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_yKe1-J73s4kQWcCYbVKIgw-500x152.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_yKe1-J73s4kQWcCYbVKIgw-150x45.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_yKe1-J73s4kQWcCYbVKIgw-768x233.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="f700">The <strong>max_min_exchange</strong> column contains all the data we need, so let’s make a <strong>barplot</strong> of this variable. We’ll color the bars by the <strong>max_min</strong> column, which marks each price as a maximum or a minimum.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:7.70489501953125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# now make a barplot 
ggplot(tmp2, aes(x = max_min_exchange, fill = max_min)) + geom_bar() + theme_bw() + theme(axis.text.x = element_text(angle = 90, hjust = 1)) + ggtitle(&quot;Exchanges with the highest and lowest price differences in Bitcoin&quot;) + ylab(&quot;Frequency&quot;) + xlab(&quot;Exchange&quot;) + scale_fill_discrete(name = &quot;BTC Price Diff. Type&quot;)" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># now make a barplot </span></span>
<span class="line"><span style="color: #F8F8F2">ggplot(tmp2, aes(</span><span style="color: #FD971F">x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> max_min_exchange, </span><span style="color: #FD971F">fill</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> max_min)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> geom_bar() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme_bw() </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> theme(</span><span style="color: #FD971F">axis.text.x</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> element_text(</span><span style="color: #FD971F">angle</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">90</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">hjust</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">)) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ggtitle(</span><span style="color: #E6DB74">&quot;Exchanges with the highest and lowest price differences in Bitcoin&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> ylab(</span><span style="color: #E6DB74">&quot;Frequency&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> xlab(</span><span style="color: #E6DB74">&quot;Exchange&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> scale_fill_discrete(</span><span style="color: #FD971F">name</span><span 
style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;BTC Price Diff. Type&quot;</span><span style="color: #F8F8F2">)</span></span></code></pre></div>


<div class="wp-block-image">
<figure class="aligncenter size-full"><img loading="lazy" decoding="async" width="828" height="500" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_L0E0OJPfWfCxcc1DKzVmiw.webp" alt="" class="wp-image-4905" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_L0E0OJPfWfCxcc1DKzVmiw.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_L0E0OJPfWfCxcc1DKzVmiw-500x302.webp 500w, https://analyticadss.com/wp-content/uploads/2022/12/1_L0E0OJPfWfCxcc1DKzVmiw-150x91.webp 150w, https://analyticadss.com/wp-content/uploads/2022/12/1_L0E0OJPfWfCxcc1DKzVmiw-768x464.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /></figure>
</div>


<p class="wp-block-paragraph" id="8183">This is interesting: <strong>Kraken</strong> seems to be the exchange that most frequently has the highest price for <strong>bitcoin</strong>. On the other hand, <strong>Bitstamp</strong> seems to be the one that most frequently has the lowest price. So if you want to do <a href="https://www.investopedia.com/terms/a/arbitrage.asp" target="_blank" rel="noreferrer noopener">arbitrage</a>, your best bet is to buy on <strong>Bitstamp</strong> and sell on <strong>Kraken</strong>.</p>






<p class="wp-block-paragraph">Read more posts on the AnalyticaDSS blog: <a href="https://analyticadss.com/blog">BLOGS</a></p>



<p class="wp-block-paragraph">Read more posts on Medium: <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on R-bloggers: <a href="https://www.r-bloggers.com/">https://www.r-bloggers.com</a></p>
<p>The post <a href="https://analyticadss.com/analyzing-cryptocurrency-markets-using-r-part-1/">Analyzing Crypto Markets using R — Part 1</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>Conway’s Game of Life With Examples in R and Python</title>
		<link>https://analyticadss.com/conways-game-of-life-with-examples-in-r-and-python/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Wed, 17 Jan 2018 15:03:07 +0000</pubDate>
				<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Technology]]></category>
		<category><![CDATA[Game Of Life]]></category>
		<category><![CDATA[Games]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4805</guid>

					<description><![CDATA[<p>“Conway’s Game of Life With Examples in R and Python” The Game of Life, also known as the “Conway’s Game of Life,” is a cellular automaton invented by mathematician John Horton Conway in 1970. It is a zero-player game, meaning that once the game is set up, it runs on its own, and there is [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/conways-game-of-life-with-examples-in-r-and-python/">Conway’s Game of Life With Examples in R and Python</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[



<p class="wp-block-paragraph" id="2c26">The Game of Life, also known as “Conway’s Game of Life,” is a cellular automaton invented by mathematician John Horton Conway in 1970. It is a zero-player game, meaning that once the initial configuration is set up, it runs on its own with no further input from a player.</p>



<p class="wp-block-paragraph" id="2239">The game is played on a grid of cells, each of which can be in one of two states: “alive” or “dead.” The state of each cell in the grid is determined by the state of the cells surrounding it, according to a set of rules. The game proceeds in a series of “generations,” with the state of each cell in the next generation being determined by the state of the cells in the current generation.</p>



<p class="wp-block-paragraph" id="51b7">The game has been used to study a wide range of topics in mathematics and computer science, including patterns, self-organization, and complexity. It has also been used as a tool for exploring artificial life and artificial intelligence.</p>



<h2 class="wp-block-heading" id="9455">Game Rules</h2>



<p class="wp-block-paragraph" id="8b79">The game rules are as follows:</p>



<ol class="wp-block-list">
<li>Any live cell with fewer than two live neighbors dies, as if by underpopulation.</li>



<li>Any live cell with two or three live neighbors lives on to the next generation.</li>



<li>Any live cell with more than three live neighbors dies, as if by overpopulation.</li>



<li>Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.</li>
</ol>



<p class="wp-block-paragraph" id="31df">These rules are applied to each cell in the grid simultaneously, with the state of the cells in the next generation being determined based on the state of the cells in the current generation. The game continues in this way, with the state of the cells being updated in each generation.</p>
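<p class="wp-block-paragraph">The simultaneous update described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names <code>count_live_neighbors</code> and <code>next_generation</code> are hypothetical, the grid is a plain list of lists holding 0 (dead) or 1 (alive), and cells beyond the edge are assumed to be dead. The key point is that the new grid is built from counts taken on the current grid only, so no cell sees a neighbor that has already been updated.</p>

```python
# Illustrative sketch; the function names and the fixed, dead-bordered grid
# are assumptions made for this example.

def count_live_neighbors(grid, row, col):
    """Count live cells among the eight neighbors; off-grid cells count as dead."""
    rows, cols = len(grid), len(grid[0])
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            r, c = row + dr, col + dc
            if r in range(rows) and c in range(cols):
                total += grid[r][c]
    return total


def next_generation(grid):
    """Apply the four rules to every cell at once, reading only the current grid."""
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            n = count_live_neighbors(grid, r, c)
            if grid[r][c] == 1:
                # Rules 1-3: a live cell survives only with two or three neighbors.
                new_grid[r][c] = 1 if n in (2, 3) else 0
            else:
                # Rule 4: a dead cell with exactly three neighbors becomes alive.
                new_grid[r][c] = 1 if n == 3 else 0
    return new_grid


# A horizontal row of three live cells (a "blinker") on a 5x5 grid.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]

for row in next_generation(blinker):
    print("".join("#" if cell else "." for cell in row))
```

<p class="wp-block-paragraph">Writing the result into a separate <code>new_grid</code> is what makes the update simultaneous; updating <code>grid</code> in place would let early changes leak into later neighbor counts.</p>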



<p class="wp-block-paragraph" id="3f76">The Game of Life can be implemented in many different ways, with the rules being applied to a two-dimensional grid, or to a one-dimensional “tape,” or even to a three-dimensional space. There are also many variations of the rules that have been explored, with different sets of rules leading to different patterns and behaviors in the game.</p>



<p class="wp-block-paragraph" id="8bb7">The Game of Life is capable of producing a wide range of patterns, depending on the initial configuration of the cells and the specific rules that are being used. Some patterns remain stable over time, while others may evolve and change over the course of the game.</p>



<h2 class="wp-block-heading" id="a0e9">Game Patterns</h2>


<div class="wp-block-image">
<figure class="aligncenter size-full is-resized"><img loading="lazy" decoding="async" loading="lazy" src="https://analyticadss.com/wp-content/uploads/2022/12/1_y7LnENgBdKekN6PrxE78Xg.webp" alt="" class="wp-image-4807" width="828" height="877" srcset="https://analyticadss.com/wp-content/uploads/2022/12/1_y7LnENgBdKekN6PrxE78Xg.webp 828w, https://analyticadss.com/wp-content/uploads/2022/12/1_y7LnENgBdKekN6PrxE78Xg-283x300.webp 283w, https://analyticadss.com/wp-content/uploads/2022/12/1_y7LnENgBdKekN6PrxE78Xg-768x813.webp 768w" sizes="auto, (max-width: 828px) 100vw, 828px" /><figcaption class="wp-element-caption">A few examples of patterns that can appear in the Game of Life</figcaption></figure>
</div>


<p class="wp-block-paragraph" id="c715">Here are a few examples of patterns that can appear in the Game of Life:</p>



<ul class="wp-block-list">
<li>Still lifes: These are patterns that remain stable over time and do not change from one generation to the next. Examples of still lifes include blocks, beehives, and loafs.</li>



<li>Oscillators: These are patterns that repeat themselves over time, cycling through a fixed set of states. Examples of oscillators include blinkers, toads, and pulsars.</li>



<li>Spaceships: These are patterns that return to their original shape after a fixed number of generations, but at a different position, so they travel across the grid. Well-known spaceships move horizontally, vertically, or diagonally.</li>



<li>Gliders: These are the smallest spaceships, made of five live cells. A glider moves diagonally across the grid, shifting by one cell every four generations. Gliders are among the simplest and best-known patterns in the Game of Life.</li>



<li>Guns: These are patterns that produce an endless stream of spaceships or other patterns. The first gun was discovered in 1970 by Bill Gosper, and many more have been found since.</li>



<li>Patterns with complex behavior: Some patterns evolve in ways that are difficult to predict from their starting configuration. Small patterns called methuselahs can run for a long time before settling down; the five-cell R-pentomino, for example, takes 1,103 generations to stabilize.</li>
</ul>
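
<p class="wp-block-paragraph">The categories above can be checked with a few lines of code. The following is a minimal sketch, not a full pattern classifier: it stores the live cells as a set of (x, y) coordinates on an unbounded grid, and the helper names <code>step</code> and <code>period_and_shift</code> are hypothetical. It reports how many generations a pattern needs to reappear and how far it has moved by then.</p>

```python
from collections import Counter


def step(live):
    """Return the next generation of a set of live (x, y) cells."""
    # Count, for every cell next to a live cell, how many live neighbors it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors; survival on 2 or 3.
    return {cell for cell, n in counts.items() if n == 3 or (n == 2 and cell in live)}


def normalize(cells):
    """Shift a pattern so its bounding box starts at (0, 0)."""
    min_x = min(x for x, _ in cells)
    min_y = min(y for _, y in cells)
    return frozenset((x - min_x, y - min_y) for x, y in cells), (min_x, min_y)


def period_and_shift(live, max_generations=20):
    """Return (period, (dx, dy)) once the starting shape reappears, else None."""
    start_shape, start_origin = normalize(live)
    current = set(live)
    for generation in range(1, max_generations + 1):
        current = step(current)
        if not current:
            return None  # the pattern died out
        shape, origin = normalize(current)
        if shape == start_shape:
            shift = (origin[0] - start_origin[0], origin[1] - start_origin[1])
            return generation, shift
    return None


block = {(0, 0), (1, 0), (0, 1), (1, 1)}            # still life
blinker = {(0, 1), (1, 1), (2, 1)}                  # oscillator
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}   # spaceship

for name, pattern in [("block", block), ("blinker", blinker), ("glider", glider)]:
    print(name, period_and_shift(pattern))
```

<p class="wp-block-paragraph">A still life reappears after one generation with no shift, an oscillator after a longer period with no shift, and a spaceship with a non-zero shift.</p>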



<p class="wp-block-paragraph" id="3417">These are just a few examples of the patterns that can appear in the Game of Life. The game is capable of producing a wide range of patterns and behaviors, and new patterns are still being discovered today.</p>



<h2 class="wp-block-heading" id="10a2">Real-Life Examples</h2>



<p class="wp-block-paragraph" id="4f78">The Game of Life has inspired a number of real-world applications and has been used to model a wide range of systems and phenomena in fields such as biology, physics, and computer science. Here are a few examples of how the Game of Life has been used in real life:</p>



<ul class="wp-block-list">
<li>Biology: Cellular automata like the Game of Life have been used as simplified models of the growth and division of cells, of patterns in the distribution of species in ecosystems, and of the spread of epidemics.</li>



<li>Physics: Related cellular automata have been used to model physical systems, such as the formation of patterns in crystals and the behavior of fluids.</li>



<li>Computer science: The Game of Life has been used as a test bed for exploring algorithms and data structures, and it has inspired the development of a number of computational models and techniques.</li>



<li>Art and design: The Game of Life has inspired a number of artistic and design projects, including digital art, installations, and interactive exhibits.</li>



<li>Education: The Game of Life has been used as a tool for teaching concepts in mathematics, computer science, and other fields, and it has been incorporated into educational software and curricula.</li>
</ul>



<p class="wp-block-paragraph" id="58cc">These are just a few examples of how the Game of Life has been used in real life. The game’s simplicity and versatility have made it a popular tool for studying a wide range of systems and phenomena.</p>



<h2 class="wp-block-heading" id="0690">Game Simulation in R</h2>



<p class="wp-block-paragraph" id="daad">The script below simulates the Game of Life on a random 100 × 100 grid for a fixed number of generations and plots the final generation. The <code>plotly</code> package is used to create an interactive heatmap, with “dead” cells shown in darker shades and “alive” cells shown in lighter shades. The x-axis of the plot is the grid column, and the y-axis is the grid row. Each generation depends on the one before it, so the generations are computed one after another in an ordinary loop rather than in parallel.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.39581298828125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# Install the plotly package if it is not already installed
# install.packages(&quot;plotly&quot;)

library(plotly)

# Number of generations to simulate
generations <- 50

# Set up the grid for the game
grid <- matrix(sample(c(0, 1), 100*100, replace = TRUE), nrow = 100)

# Initialize a list to store the state of the cells at each generation
grid_history <- list(grid)

# Simulate the game one generation at a time, because each generation
# depends on the one before it
for (i in 1:generations) {
  # Get the current state of the cells
  current_state <- grid_history[[i]]
  
  # Initialize the next state of the cells
  next_state <- matrix(0, nrow = nrow(current_state), ncol = ncol(current_state))
  
  # Iterate over each cell in the grid
  for (x in 1:nrow(current_state)) {
    for (y in 1:ncol(current_state)) {
      # Get the number of live neighbors for the current cell
      neighbors <- sum(current_state[max(x-1, 1):min(x+1, nrow(current_state)), 
                                     max(y-1, 1):min(y+1, ncol(current_state))]) - current_state[x, y]
      
      # Apply the rules of the Game of Life to determine the next state of the cell
      if (current_state[x, y] == 1) {
        if (neighbors < 2 || neighbors > 3) {
          next_state[x, y] <- 0
        } else {
          next_state[x, y] <- 1
        }
      } else {
        if (neighbors == 3) {
          next_state[x, y] <- 1
        } else {
          next_state[x, y] <- 0
        }
      }
    }
  }
  
  # Store the state of the cells at the new generation
  grid_history[[i + 1]] <- next_state
}

# Plot the final generation using the plotly package
plot_ly(z = grid_history[[generations + 1]], colorscale = &quot;Blackbody&quot;, type = &quot;heatmap&quot;) %>%
  layout(xaxis = list(title = &quot;Column&quot;), yaxis = list(title = &quot;Row&quot;))" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># Install the plotly package if it is not already installed</span></span>
<span class="line"><span style="color: #88846F"># install.packages(&quot;plotly&quot;)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #66D9EF">library</span><span style="color: #F8F8F2">(plotly)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Number of generations to simulate</span></span>
<span class="line"><span style="color: #F8F8F2">generations </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">50</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Set up the grid for the game</span></span>
<span class="line"><span style="color: #F8F8F2">grid </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">matrix</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">sample</span><span style="color: #F8F8F2">(</span><span style="color: #66D9EF">c</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">), </span><span style="color: #AE81FF">100</span><span style="color: #F92672">*</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">replace</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">TRUE</span><span style="color: #F8F8F2">), </span><span style="color: #FD971F">nrow</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Initialize a list to store the state of the cells at each generation</span></span>
<span class="line"><span style="color: #F8F8F2">grid_history </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">(grid)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Simulate the game one generation at a time, because each generation</span></span>
<span class="line"><span style="color: #88846F"># depends on the one before it</span></span>
<span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #F8F8F2">generations) {</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># Get the current state of the cells</span></span>
<span class="line"><span style="color: #F8F8F2">  current_state </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> grid_history[[i]]</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># Initialize the next state of the cells</span></span>
<span class="line"><span style="color: #F8F8F2">  next_state </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">matrix</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">nrow</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">nrow</span><span style="color: #F8F8F2">(current_state), </span><span style="color: #FD971F">ncol</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">ncol</span><span style="color: #F8F8F2">(current_state))</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># Iterate over each cell in the grid</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (x </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #66D9EF">nrow</span><span style="color: #F8F8F2">(current_state)) {</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> (y </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span><span style="color: #66D9EF">ncol</span><span style="color: #F8F8F2">(current_state)) {</span></span>
<span class="line"><span style="color: #F8F8F2">      </span><span style="color: #88846F"># Get the number of live neighbors for the current cell</span></span>
<span class="line"><span style="color: #F8F8F2">      neighbors </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">sum</span><span style="color: #F8F8F2">(current_state[</span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(x</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(x</span><span style="color: #F92672">+</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">nrow</span><span style="color: #F8F8F2">(current_state)), </span></span>
<span class="line"><span style="color: #F8F8F2">                                     </span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(y</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(y</span><span style="color: #F92672">+</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #66D9EF">ncol</span><span style="color: #F8F8F2">(current_state))]) </span><span style="color: #F92672">-</span><span style="color: #F8F8F2"> current_state[x, y]</span></span>
<span class="line"><span style="color: #F8F8F2">      </span></span>
<span class="line"><span style="color: #F8F8F2">      </span><span style="color: #88846F"># Apply the rules of the Game of Life to determine the next state of the cell</span></span>
<span class="line"><span style="color: #F8F8F2">      </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> (current_state[x, y] </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> (neighbors </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">||</span><span style="color: #F8F8F2"> neighbors </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">          next_state[x, y] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span></span>
<span class="line"><span style="color: #F8F8F2">        } </span><span style="color: #F92672">else</span><span style="color: #F8F8F2"> {</span></span>
<span class="line"><span style="color: #F8F8F2">          next_state[x, y] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #F8F8F2">        }</span></span>
<span class="line"><span style="color: #F8F8F2">      } </span><span style="color: #F92672">else</span><span style="color: #F8F8F2"> {</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> (neighbors </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F8F8F2">) {</span></span>
<span class="line"><span style="color: #F8F8F2">          next_state[x, y] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #F8F8F2">        } </span><span style="color: #F92672">else</span><span style="color: #F8F8F2"> {</span></span>
<span class="line"><span style="color: #F8F8F2">          next_state[x, y] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0</span></span>
<span class="line"><span style="color: #F8F8F2">        }</span></span>
<span class="line"><span style="color: #F8F8F2">      }</span></span>
<span class="line"><span style="color: #F8F8F2">    }</span></span>
<span class="line"><span style="color: #F8F8F2">  }</span></span>
<span class="line"><span style="color: #F8F8F2">  </span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #88846F"># Store the state of the cells at the new generation</span></span>
<span class="line"><span style="color: #F8F8F2">  grid_history[[i </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">]] </span><span style="color: #F92672"><-</span><span style="color: #F8F8F2"> next_state</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Plot the final generation using the plotly package</span></span>
<span class="line"><span style="color: #F8F8F2">plot_ly(</span><span style="color: #FD971F">z</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> grid_history[[generations </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">]], </span><span style="color: #FD971F">colorscale</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Blackbody&quot;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">type</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;heatmap&quot;</span><span style="color: #F8F8F2">) </span><span style="color: #F92672">%>%</span></span>
<span class="line"><span style="color: #F8F8F2">  </span><span style="color: #66D9EF">layout</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">xaxis</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Column&quot;</span><span style="color: #F8F8F2">), </span><span style="color: #FD971F">yaxis</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">list</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">title</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #E6DB74">&quot;Row&quot;</span><span style="color: #F8F8F2">))</span></span></code></pre></div>






<h2 class="wp-block-heading" id="4a90">Game Simulation in Python</h2>



<p class="wp-block-paragraph" id="909d">The script below simulates the Game of Life on a random 100 × 100 grid for a fixed number of generations and plots the final generation. The <code>matplotlib</code> package is used to create a heatmap; with the <code>Greys</code> colormap, “dead” cells are shown in white and “alive” cells in black. The x-axis of the plot is the grid column, and the y-axis is the grid row.</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395835876464844px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="import numpy as np
import matplotlib.pyplot as plt

# Number of generations to simulate
generations = 50

# Set up the grid for the game
grid = np.random.choice([0, 1], size=(100, 100))

# Initialize a list to store the state of the cells at each generation
grid_history = [grid]

# Simulate the game
for i in range(generations):
    # Get the current state of the cells
    current_state = grid_history[i]

    # Initialize the next state of the cells
    next_state = np.zeros((100, 100))

    # Iterate over each cell in the grid
    for x in range(100):
        for y in range(100):
            # Get the number of live neighbors for the current cell
            neighbors = (current_state[max(x-1, 0):min(x+2, 100), max(y-1, 0):min(y+2, 100)]).sum() - current_state[x, y]

            # Apply the rules of the Game of Life to determine the next state of the cell
            if current_state[x, y] == 1:
                if neighbors < 2 or neighbors > 3:
                    next_state[x, y] = 0
                else:
                    next_state[x, y] = 1
            else:
                if neighbors == 3:
                    next_state[x, y] = 1
                else:
                    next_state[x, y] = 0

    # Update the state of the cells
    grid = next_state

    # Store the state of the cells at the current generation
    grid_history.append(grid)

# Plot the final generation using the matplotlib package
plt.imshow(grid_history[-1], cmap=&quot;Greys&quot;)
plt.xlabel(&quot;Column&quot;)
plt.ylabel(&quot;Row&quot;)
plt.show()" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F8F8F2">import numpy as np</span></span>
<span class="line"><span style="color: #F8F8F2">import matplotlib.pyplot as plt</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Number of generations to simulate</span></span>
<span class="line"><span style="color: #FD971F">generations</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">50</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Set up the grid for the game</span></span>
<span class="line"><span style="color: #FD971F">grid</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.random.choice([</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">], </span><span style="color: #FD971F">size</span><span style="color: #F92672">=</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Initialize a list to store the state of the cells at each generation</span></span>
<span class="line"><span style="color: #FD971F">grid_history</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> [grid]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Simulate the game</span></span>
<span class="line"><span style="color: #F8F8F2">for i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">range</span><span style="color: #F8F8F2">(generations)</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># Get the current state of the cells</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">current_state</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> grid_history[i]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># Initialize the next state of the cells</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">next_state</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.zeros((</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># Iterate over each cell in the grid</span></span>
<span class="line"><span style="color: #F8F8F2">    for x </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">range</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">        for y </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">range</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">            </span><span style="color: #88846F"># Get the number of live neighbors for the current cell</span></span>
<span class="line"><span style="color: #F8F8F2">            </span><span style="color: #FD971F">neighbors</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> (current_state[</span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(x</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(x</span><span style="color: #F92672">+</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">), </span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(y</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">)</span><span style="color: #F92672">:</span><span style="color: #66D9EF">min</span><span style="color: #F8F8F2">(y</span><span style="color: #F92672">+</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">)]).sum() </span><span style="color: #F92672">-</span><span style="color: #F8F8F2"> current_state[x, y]</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">            </span><span style="color: #88846F"># Apply the rules of the Game of Life to determine the next state of the cell</span></span>
<span class="line"><span style="color: #F8F8F2">            if current_state[x, y] </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">1</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">                if neighbors </span><span style="color: #F92672"><</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2"> or neighbors </span><span style="color: #F92672">></span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">                    next_state[x, y] = </span><span style="color: #AE81FF">0</span></span>
<span class="line"><span style="color: #F8F8F2">                </span><span style="color: #F92672">else:</span></span>
<span class="line"><span style="color: #F8F8F2">                    next_state[x, y] = </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #F8F8F2">            </span><span style="color: #F92672">else:</span></span>
<span class="line"><span style="color: #F8F8F2">                if neighbors </span><span style="color: #F92672">==</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">3</span><span style="color: #F92672">:</span></span>
<span class="line"><span style="color: #F8F8F2">                    next_state[x, y] = </span><span style="color: #AE81FF">1</span></span>
<span class="line"><span style="color: #F8F8F2">                </span><span style="color: #F92672">else:</span></span>
<span class="line"><span style="color: #F8F8F2">                    next_state[x, y] = </span><span style="color: #AE81FF">0</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># Update the state of the cells</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">grid</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> next_state</span></span>
<span class="line"></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># Store the state of the cells at the current generation</span></span>
<span class="line"><span style="color: #F8F8F2">    grid_history.append(grid)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># Plot the final generation using the matplotlib package</span></span>
<span class="line"><span style="color: #F8F8F2">plt.imshow(grid_history[</span><span style="color: #F92672">-</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">], </span><span style="color: #FD971F">cmap</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&quot;Greys&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">plt.xlabel(</span><span style="color: #E6DB74">&quot;Column&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">plt.ylabel(</span><span style="color: #E6DB74">&quot;Row&quot;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">plt.show()</span></span></code></pre></div>
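
<p class="wp-block-paragraph">The nested loops above visit every cell one at a time, which is slow in pure Python. As a hedged alternative, the sketch below counts neighbors for the whole grid at once with <code>np.roll</code> and records the number of live cells in each generation. Two things here are assumptions rather than part of the script above: the edges wrap around, so the grid behaves like a torus instead of having dead borders, and the function names <code>step_wrapped</code> and <code>population_history</code> are hypothetical. A fixed seed is used so that repeated runs start from the same grid.</p>

```python
import numpy as np


def step_wrapped(grid):
    """One Conway generation on a 0/1 grid whose edges wrap around."""
    # Each np.roll shifts the whole grid by one cell in one of eight directions;
    # summing the shifted copies gives every cell's live-neighbor count.
    neighbors = sum(
        np.roll(grid, shift=(dx, dy), axis=(0, 1))
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Alive next if exactly 3 neighbors, or alive now with exactly 2.
    return ((neighbors == 3) | ((grid == 1) & (neighbors == 2))).astype(int)


def population_history(grid, generations):
    """Return the live-cell count at generations 0, 1, ..., `generations`."""
    counts = [int(grid.sum())]
    for _ in range(generations):
        grid = step_wrapped(grid)
        counts.append(int(grid.sum()))
    return counts


rng = np.random.default_rng(seed=42)  # fixed seed so runs are repeatable
start = rng.integers(0, 2, size=(100, 100))
history = population_history(start, generations=50)
```

<p class="wp-block-paragraph">Passing <code>history</code> to <code>plt.plot</code> gives a view of the simulation over time, with the generation on the x-axis and the number of live cells on the y-axis.</p>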



<h2 class="wp-block-heading">Conclusion</h2>



<p class="wp-block-paragraph">The Game of Life is a cellular automaton and a classic example of a simple mathematical model that can produce complex and fascinating patterns. A grid of cells is initialized with some cells “alive,” and the state of each cell in the next generation is determined by rules that depend on how many live neighbors it has. Depending on the initial state of the cells and the rules applied, the game can produce a wide range of patterns, from stable configurations to chaotic ones.</p>



<p class="wp-block-paragraph">Read more posts on the AnalyticaDSS blog: <a href="https://analyticadss.com/blog">BLOGS</a></p>



<p class="wp-block-paragraph">Read more posts on Medium: <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on R-bloggers: <a href="https://www.r-bloggers.com/">https://www.r-bloggers.com</a></p>
<p>The post <a href="https://analyticadss.com/conways-game-of-life-with-examples-in-r-and-python/">Conway’s Game of Life With Examples in R and Python</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>What is Reinforcement Learning?</title>
		<link>https://analyticadss.com/what-is-reinforcement-learning/</link>
		
		<dc:creator><![CDATA[Aous Abdo]]></dc:creator>
		<pubDate>Wed, 01 Nov 2017 14:23:29 +0000</pubDate>
				<category><![CDATA[Artificial Intelligence]]></category>
		<category><![CDATA[Data Science]]></category>
		<category><![CDATA[Machine Learning]]></category>
		<category><![CDATA[R Statistical Language]]></category>
		<category><![CDATA[Future]]></category>
		<category><![CDATA[Machine Intelligence]]></category>
		<category><![CDATA[R Code]]></category>
		<category><![CDATA[Reinforcement Learning]]></category>
		<category><![CDATA[Technology]]></category>
		<guid isPermaLink="false">https://analyticadss.com/?p=4785</guid>

					<description><![CDATA[<p>Reinforcement learning is a type of machine learning that involves the use of algorithms to learn from the consequences of their actions. It is based on the idea that an agent, such as a robot or a computer program, can learn to optimize its behavior by receiving rewards or punishments for its actions. In a [&#8230;]</p>
<p>The post <a href="https://analyticadss.com/what-is-reinforcement-learning/">What is Reinforcement Learning?</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p class="wp-block-paragraph" id="3531"><strong>Reinforcement learning is a type of machine learning in which an agent learns from the consequences of its own actions.</strong> It is based on the idea that an agent, such as a robot or a computer program, can learn to optimize its behavior by receiving rewards or punishments for what it does.</p>



<p class="wp-block-paragraph" id="cf4e">In a reinforcement learning system, the agent interacts with its environment by taking actions and observing the resulting rewards or punishments. The goal of the agent is to learn the best possible strategy for maximizing the rewards over time. This is done through trial and error, where the agent explores different actions and learns from their consequences.</p>
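
<p class="wp-block-paragraph">The trial-and-error loop described above can be sketched without any library. The snippet below is a minimal illustration, not a full reinforcement learning system: it assumes a made-up three-armed bandit (three actions whose average rewards of 0.1, 0.5, and 0.9 were chosen arbitrarily) and a hypothetical helper named <code>run_bandit</code>. The agent tries a random action 10% of the time and otherwise picks the action with the best estimated reward so far.</p>

```python
import random


def run_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent on a bandit with noisy (Gaussian) rewards."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    estimates = [0.0] * n_arms  # running estimate of each action's mean reward
    counts = [0] * n_arms       # how often each action has been tried

    for _ in range(steps):
        # explore with probability epsilon, otherwise exploit the best estimate
        if epsilon > rng.random():
            arm = rng.randrange(n_arms)
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # the environment answers with a noisy reward
        reward = rng.gauss(true_means[arm], 0.5)

        # incremental average: move the estimate toward the observed reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


estimates, counts = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", [round(e, 2) for e in estimates])
print("best action:", best_arm, "times each action was tried:", counts)
```

<p class="wp-block-paragraph">The agent is never told which action is best; it only sees rewards. Its estimates improve simply because actions that paid off are tried more often, which is the core idea the rest of this post builds on.</p>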



<p class="wp-block-paragraph" id="dbf4">One of the key features of reinforcement learning is that the agent can learn from experience, without being explicitly programmed with a set of rules or instructions. This allows the agent to adapt and improve its behavior based on the feedback it receives from the environment.</p>



<h3 class="wp-block-heading">An example of reinforcement learning in action:</h3>



<p class="wp-block-paragraph" id="63e1">Consider a robot that is trained to navigate a maze. The robot is placed in the maze and must find its way to the goal. As it moves through the maze, it receives rewards for actions that bring it closer to the goal and punishments for actions that move it away from the goal. Over time, the robot learns the best strategy for navigating the maze and finds the quickest way to the goal.</p>
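
<p class="wp-block-paragraph">To make the maze example concrete, here is a small sketch of tabular Q-learning on a toy grid. Everything in it is an assumption made for illustration: the 3x4 layout, the helper names (<code>find</code>, <code>step</code>, <code>train</code>, <code>greedy_path</code>), and the learning parameters. It also uses a simpler reward scheme than the one described above: +1 for reaching the goal and a small penalty of -0.04 for every other step, rather than a reward tied to the distance from the goal.</p>

```python
import random

MAZE = ["S..#",
        ".#..",
        "...G"]  # S = start, G = goal, # = wall
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def find(symbol):
    """Return the (row, column) of the first cell holding the symbol."""
    for r, row in enumerate(MAZE):
        if symbol in row:
            return (r, row.index(symbol))


def step(state, action):
    """Move if the target cell is inside the maze and is not a wall."""
    r, c = state[0] + action[0], state[1] + action[1]
    inside = r in range(len(MAZE)) and c in range(len(MAZE[0]))
    if inside and MAZE[r][c] != "#":
        state = (r, c)
    done = MAZE[state[0]][state[1]] == "G"
    reward = 1.0 if done else -0.04
    return state, reward, done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = {}  # maps (state, action index) to an estimated value
    for _ in range(episodes):
        state = find("S")
        for _ in range(100):  # cap the episode length
            # epsilon-greedy choice between exploring and exploiting
            if epsilon > rng.random():
                a = rng.randrange(4)
            else:
                a = max(range(4), key=lambda i: q.get((state, i), 0.0))
            nxt, reward, done = step(state, ACTIONS[a])
            # Q-learning update: reward plus discounted best value of the next state
            best_next = 0.0 if done else max(q.get((nxt, i), 0.0) for i in range(4))
            old = q.get((state, a), 0.0)
            q[(state, a)] = old + alpha * (reward + gamma * best_next - old)
            state = nxt
            if done:
                break
    return q


def greedy_path(q, limit=20):
    """Follow the highest-valued action from the start, without exploring."""
    state = find("S")
    path = [state]
    for _ in range(limit):
        a = max(range(4), key=lambda i: q.get((state, i), 0.0))
        state, _, done = step(state, ACTIONS[a])
        path.append(state)
        if done:
            break
    return path


q = train()
path = greedy_path(q)
print(path)
```

<p class="wp-block-paragraph">After training, following the highest-valued action from each cell traces the route the agent has learned from the start to the goal.</p>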



<h3 class="wp-block-heading">More examples</h3>



<p class="wp-block-paragraph" id="de35">Another example of reinforcement learning is a <strong>computer program that learns to play a game</strong>, such as chess or Go. The program is given the rules of the game and must learn to make the best possible moves based on the rewards and punishments it receives. This requires the program to analyze the current state of the game and consider the possible moves in order to choose the one that is most likely to lead to a win.</p>



<p class="wp-block-paragraph" id="4807">A <strong>robotic arm</strong> used in a manufacturing setting can be trained using reinforcement learning to perform tasks such as picking up and placing objects. The robotic arm receives rewards for successfully completing the tasks and punishments for making mistakes, and learns to optimize its movements over time.</p>



<p class="wp-block-paragraph" id="ce11">A <strong>virtual personal assistant</strong>, such as Apple’s Siri or Amazon’s Alexa, can use reinforcement learning to improve its performance over time. The assistant receives rewards for providing accurate and helpful responses to user requests, and learns to optimize its decision making and natural language processing abilities based on this feedback.</p>



<p class="wp-block-paragraph" id="65c9">A <strong>stock trading algorithm</strong> can use reinforcement learning to make decisions about buying and selling stocks. The algorithm receives rewards for making profitable trades and punishments for making unprofitable ones, and learns to optimize its predictions and decision making based on this feedback.</p>



<p class="wp-block-paragraph" id="a19b">Here is a simple example of reinforcement learning in Python using the OpenAI Gym library:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.395843505859375px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="import gym

# create the environment
env = gym.make('MountainCar-v0')

# initialize the agent
agent = Agent()

# run the simulation for 100 episodes
for episode in range(100):
    # reset the environment
    state = env.reset()
    
    # run the episode until it is done
    while True:
        # choose an action based on the current state
        action = agent.choose_action(state)
        
        # take the action and observe the reward and next state
        next_state, reward, done, _ = env.step(action)
        
        # update the agent based on the reward and next state
        agent.update(state, action, reward, next_state)
        
        # update the current state
        state = next_state
        
        # if the episode is done, break the loop
        if done:
            break" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> gym</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># create the environment</span></span>
<span class="line"><span style="color: #F8F8F2">env </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> gym.make(</span><span style="color: #E6DB74">&#39;MountainCar-v0&#39;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># initialize the agent</span></span>
<span class="line"><span style="color: #F8F8F2">agent </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> Agent()</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># run the simulation for 100 episodes</span></span>
<span class="line"><span style="color: #F92672">for</span><span style="color: #F8F8F2"> episode </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">range</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">100</span><span style="color: #F8F8F2">):</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># reset the environment</span></span>
<span class="line"><span style="color: #F8F8F2">    state </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> env.reset()</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># run the episode until it is done</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">while</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">True</span><span style="color: #F8F8F2">:</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># choose an action based on the current state</span></span>
<span class="line"><span style="color: #F8F8F2">        action </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> agent.choose_action(state)</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># take the action and observe the reward and next state</span></span>
<span class="line"><span style="color: #F8F8F2">        next_state, reward, done, _ </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> env.step(action)</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># update the agent based on the reward and next state</span></span>
<span class="line"><span style="color: #F8F8F2">        agent.update(state, action, reward, next_state)</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># update the current state</span></span>
<span class="line"><span style="color: #F8F8F2">        state </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> next_state</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># if the episode is done, break the loop</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> done:</span></span>
<span class="line"><span style="color: #F8F8F2">            </span><span style="color: #F92672">break</span></span></code></pre></div>



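<p class="wp-block-paragraph" id="8a26">In this example, we create an environment using the <code>gym.make</code> function and initialize the agent using an <code>Agent</code> class. Gym does not provide this class; it stands for any object you define yourself with <code>choose_action</code> and <code>update</code> methods. We then run the simulation for 100 episodes. At each step the agent chooses an action based on the current state, receives a reward, and is updated, and an episode ends when the environment reports that it is done. Note that the loop follows the classic Gym API. In Gym 0.26 and later (and in Gymnasium), <code>env.reset()</code> returns an <code>(observation, info)</code> pair and <code>env.step()</code> returns five values, with <code>done</code> split into <code>terminated</code> and <code>truncated</code>, so those two calls need to be adapted there.</p>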
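
<p class="wp-block-paragraph">One possible way to fill in the <code>Agent</code> placeholder is sketched below. It is a hypothetical tabular Q-learning agent, not something that ships with Gym: the class body, the number of bins, and the learning parameters are all assumptions. It relies on the fact that MountainCar-v0 observations are (position, velocity) pairs, with position between -1.2 and 0.6 and velocity between -0.07 and 0.07, and that there are three discrete actions.</p>

```python
import random


class Agent:
    """Hypothetical tabular Q-learning agent for MountainCar-v0 (illustrative only)."""

    def __init__(self, n_bins=20, n_actions=3, alpha=0.1, gamma=0.99, epsilon=0.1):
        self.n_bins = n_bins
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # exploration rate
        self.low = (-1.2, -0.07)   # lower bounds of position and velocity
        self.high = (0.6, 0.07)    # upper bounds of position and velocity
        self.q = {}  # maps a discretized state to a list of action values

    def _bucket(self, state):
        # map each continuous component to an integer bin index
        idx = []
        for value, lo, hi in zip(state, self.low, self.high):
            ratio = (value - lo) / (hi - lo)
            idx.append(min(self.n_bins - 1, max(0, int(ratio * self.n_bins))))
        return tuple(idx)

    def _values(self, state):
        # create the row of action values on first visit
        return self.q.setdefault(self._bucket(state), [0.0] * self.n_actions)

    def choose_action(self, state):
        # explore with probability epsilon, otherwise take the best known action
        if self.epsilon > random.random():
            return random.randrange(self.n_actions)
        values = self._values(state)
        return values.index(max(values))

    def update(self, state, action, reward, next_state):
        # move the estimate toward reward + discounted best value of the next state
        target = reward + self.gamma * max(self._values(next_state))
        values = self._values(state)
        values[action] += self.alpha * (target - values[action])


agent = Agent(epsilon=0.0)
state = (-0.5, 0.0)
agent.update(state, 2, -1.0, (-0.49, 0.01))
print(agent.choose_action(state))
```

<p class="wp-block-paragraph">The method names and arguments match the calls in the loop above. Because <code>update</code> is not told whether the episode has ended, this sketch always bootstraps from the next state, which is a simplification.</p>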



<p class="wp-block-paragraph" id="825c">And here is a somewhat more sophisticated example, which sets up a small neural-network agent for the CartPole environment using TensorFlow:</p>



<div class="wp-block-kevinbatdorf-code-block-pro cbp-has-line-numbers" style="font-size:.875rem;--cbp-line-number-color:#F8F8F2;--cbp-line-number-width:15.39581298828125px;line-height:1.25rem"><span style="display:block;padding:16px 0 0 16px;margin-bottom:-1px;width:100%;text-align:left;background-color:#272822"><svg xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14"><g fill="none" fill-rule="evenodd" transform="translate(1 1)"><circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle><circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle><circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle></g></svg></span><span role="button" tabindex="0" data-code="# install the OpenAI Gym and TensorFlow libraries
!pip install gym tensorflow

# import the required libraries
import gym
import numpy as np
import tensorflow as tf

# create the environment
env = gym.make('CartPole-v0')

# create the model
model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(32, activation='relu', input_shape=(4,)),
    tf.keras.layers.Dense(2, activation='linear')
])

# compile the model
model.compile(
    optimizer='adam',
    loss='mse'
)

# define the agent
agent = {
    'model': model,
    'memory': [],
    'epsilon': 1,
    'epsilon_min': 0.01,
    'epsilon_decay': 0.995
}

# define the choose_action function
def choose_action(state):
    # if a random number is less than epsilon, choose a random action
    if np.random.uniform() &lt; agent['epsilon']:
        action = np.random.randint(0, 2)
    else:
        # otherwise, predict the action using the model
        action = np.argmax(model.predict(np.array([state]))[0])
    
    # return the action
    return action

# define the remember function
def remember(state, action, reward, next_state, done):
    # add the experience to the memory
    agent['memory'].append((state, action, reward, next_state, done))

# define the replay function
def replay(batch_size):
    # sample a random batch of experiences from the memory by index
    indices = np.random.choice(len(agent['memory']), batch_size)
    batch = [agent['memory'][j] for j in indices]
    
    # create empty arrays for the states and targets
    states = np.zeros((batch_size, 4))
    targets = np.zeros((batch_size, 2))
    
    # loop over the experiences in the batch
    for i in range(batch_size):
        # get the state, action, reward, next_state, and done from the experience
        state, action, reward, next_state, done = batch[i]
        
        # if the episode is not done, add the discounted value of the next state
        if not done:
            target = reward + 0.95 * np.max(model.predict(np.array([next_state]))[0])
        else:
            target = reward
        
        # keep the current predictions and overwrite only the action that was taken
        states[i] = state
        targets[i] = model.predict(np.array([state]))[0]
        targets[i][action] = target
    
    # update the model using the states and targets
    model.fit(states, targets, epochs=1, verbose=0)
    
    # decay epsilon so that the agent explores less over time
    agent['epsilon'] = max(agent['epsilon_min'], agent['epsilon'] * agent['epsilon_decay'])" style="color:#F8F8F2;display:none" aria-label="Copy" class="code-block-pro-copy-button"><svg xmlns="http://www.w3.org/2000/svg" style="width:24px;height:24px" fill="none" viewBox="0 0 24 24" stroke="currentColor" stroke-width="2"><path class="with-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2m-6 9l2 2 4-4"></path><path class="without-check" stroke-linecap="round" stroke-linejoin="round" d="M9 5H7a2 2 0 00-2 2v12a2 2 0 002 2h10a2 2 0 002-2V7a2 2 0 00-2-2h-2M9 5a2 2 0 002 2h2a2 2 0 002-2M9 5a2 2 0 012-2h2a2 2 0 012 2"></path></svg></span><pre class="shiki" style="background-color: #272822"><code><span class="line"><span style="color: #88846F"># install the OpenAI Gym and TensorFlow libraries</span></span>
<span class="line"><span style="color: #F44747">!</span><span style="color: #F8F8F2">pip install gym tensorflow</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># import the required libraries</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> gym</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> numpy </span><span style="color: #F92672">as</span><span style="color: #F8F8F2"> np</span></span>
<span class="line"><span style="color: #F92672">import</span><span style="color: #F8F8F2"> tensorflow </span><span style="color: #F92672">as</span><span style="color: #F8F8F2"> tf</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># create the environment</span></span>
<span class="line"><span style="color: #F8F8F2">env </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> gym.make(</span><span style="color: #E6DB74">&#39;CartPole-v0&#39;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># create the model</span></span>
<span class="line"><span style="color: #F8F8F2">model </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> tf.keras.models.Sequential([</span></span>
<span class="line"><span style="color: #F8F8F2">    tf.keras.layers.Dense(</span><span style="color: #AE81FF">32</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">activation</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;relu&#39;</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">input_shape</span><span style="color: #F92672">=</span><span style="color: #F8F8F2">(</span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">,)),</span></span>
<span class="line"><span style="color: #F8F8F2">    tf.keras.layers.Dense(</span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">activation</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;linear&#39;</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">])</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># compile the model</span></span>
<span class="line"><span style="color: #F8F8F2">model.compile(</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">optimizer</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;adam&#39;</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #FD971F">loss</span><span style="color: #F92672">=</span><span style="color: #E6DB74">&#39;mse&#39;</span></span>
<span class="line"><span style="color: #F8F8F2">)</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># define the agent</span></span>
<span class="line"><span style="color: #F8F8F2">agent </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> {</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #E6DB74">&#39;model&#39;</span><span style="color: #F8F8F2">: model,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #E6DB74">&#39;memory&#39;</span><span style="color: #F8F8F2">: [],</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #E6DB74">&#39;epsilon&#39;</span><span style="color: #F8F8F2">: </span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #E6DB74">&#39;epsilon_min&#39;</span><span style="color: #F8F8F2">: </span><span style="color: #AE81FF">0.01</span><span style="color: #F8F8F2">,</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #E6DB74">&#39;epsilon_decay&#39;</span><span style="color: #F8F8F2">: </span><span style="color: #AE81FF">0.995</span></span>
<span class="line"><span style="color: #F8F8F2">}</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># define the choose_action function</span></span>
<span class="line"><span style="color: #66D9EF">def</span><span style="color: #F8F8F2"> </span><span style="color: #A6E22E">choose_action</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">state</span><span style="color: #F8F8F2">):</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># if a random number is less than epsilon, choose a random action</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> np.random.uniform() </span><span style="color: #F92672">&lt;</span><span style="color: #F8F8F2"> agent[</span><span style="color: #E6DB74">&#39;epsilon&#39;</span><span style="color: #F8F8F2">]:</span></span>
<span class="line"><span style="color: #F8F8F2">        action </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.random.randint(</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">else</span><span style="color: #F8F8F2">:</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># otherwise, predict the action using the model</span></span>
<span class="line"><span style="color: #F8F8F2">        action </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.argmax(model.predict(np.array([state]))[</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># return the action</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">return</span><span style="color: #F8F8F2"> action</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># define the remember function</span></span>
<span class="line"><span style="color: #66D9EF">def</span><span style="color: #F8F8F2"> </span><span style="color: #A6E22E">remember</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">state</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">action</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">reward</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">next_state</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">done</span><span style="color: #F8F8F2">):</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># add the experience to the memory</span></span>
<span class="line"><span style="color: #F8F8F2">    agent[</span><span style="color: #E6DB74">&#39;memory&#39;</span><span style="color: #F8F8F2">].append((state, action, reward, next_state, done))</span></span>
<span class="line"></span>
<span class="line"><span style="color: #88846F"># define the replay function</span></span>
<span class="line"><span style="color: #66D9EF">def</span><span style="color: #F8F8F2"> </span><span style="color: #A6E22E">replay</span><span style="color: #F8F8F2">(</span><span style="color: #FD971F">batch_size</span><span style="color: #F8F8F2">):</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># sample a random batch of experiences from the memory by index</span></span>
<span class="line"><span style="color: #F8F8F2">    indices </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.random.choice(</span><span style="color: #66D9EF">len</span><span style="color: #F8F8F2">(agent[</span><span style="color: #E6DB74">&#39;memory&#39;</span><span style="color: #F8F8F2">]), batch_size)</span></span>
<span class="line"><span style="color: #F8F8F2">    batch </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> [agent[</span><span style="color: #E6DB74">&#39;memory&#39;</span><span style="color: #F8F8F2">][j] </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> j </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> indices]</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># create empty arrays for the states and targets</span></span>
<span class="line"><span style="color: #F8F8F2">    states </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.zeros((batch_size, </span><span style="color: #AE81FF">4</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">    targets </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> np.zeros((batch_size, </span><span style="color: #AE81FF">2</span><span style="color: #F8F8F2">))</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># loop over the experiences in the batch</span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #F92672">for</span><span style="color: #F8F8F2"> i </span><span style="color: #F92672">in</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">range</span><span style="color: #F8F8F2">(batch_size):</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># get the state, action, reward, next_state, and done from the experience</span></span>
<span class="line"><span style="color: #F8F8F2">        state, action, reward, next_state, done </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> batch[i]</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># if the episode is not done, add the discounted value of the next state</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #F92672">if</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">not</span><span style="color: #F8F8F2"> done:</span></span>
<span class="line"><span style="color: #F8F8F2">            target </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> reward </span><span style="color: #F92672">+</span><span style="color: #F8F8F2"> </span><span style="color: #AE81FF">0.95</span><span style="color: #F8F8F2"> </span><span style="color: #F92672">*</span><span style="color: #F8F8F2"> np.max(model.predict(np.array([next_state]))[</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">])</span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #F92672">else</span><span style="color: #F8F8F2">:</span></span>
<span class="line"><span style="color: #F8F8F2">            target </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> reward</span></span>
<span class="line"><span style="color: #F8F8F2">        </span></span>
<span class="line"><span style="color: #F8F8F2">        </span><span style="color: #88846F"># keep the current predictions and overwrite only the action that was taken</span></span>
<span class="line"><span style="color: #F8F8F2">        states[i] </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> state</span></span>
<span class="line"><span style="color: #F8F8F2">        targets[i] </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> model.predict(np.array([state]))[</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">]</span></span>
<span class="line"><span style="color: #F8F8F2">        targets[i][action] </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> target</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># update the model using the states and targets</span></span>
<span class="line"><span style="color: #F8F8F2">    model.fit(states, targets, </span><span style="color: #FD971F">epochs</span><span style="color: #F92672">=</span><span style="color: #AE81FF">1</span><span style="color: #F8F8F2">, </span><span style="color: #FD971F">verbose</span><span style="color: #F92672">=</span><span style="color: #AE81FF">0</span><span style="color: #F8F8F2">)</span></span>
<span class="line"><span style="color: #F8F8F2">    </span></span>
<span class="line"><span style="color: #F8F8F2">    </span><span style="color: #88846F"># decay epsilon so that the agent explores less over time</span></span>
<span class="line"><span style="color: #F8F8F2">    agent[</span><span style="color: #E6DB74">&#39;epsilon&#39;</span><span style="color: #F8F8F2">] </span><span style="color: #F92672">=</span><span style="color: #F8F8F2"> </span><span style="color: #66D9EF">max</span><span style="color: #F8F8F2">(agent[</span><span style="color: #E6DB74">&#39;epsilon_min&#39;</span><span style="color: #F8F8F2">], agent[</span><span style="color: #E6DB74">&#39;epsilon&#39;</span><span style="color: #F8F8F2">] </span><span style="color: #F92672">*</span><span style="color: #F8F8F2"> agent[</span><span style="color: #E6DB74">&#39;epsilon_decay&#39;</span><span style="color: #F8F8F2">])</span></span></code></pre></div>



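<p class="wp-block-paragraph" id="b369">Of course, these are simple examples, and a real-world reinforcement learning system would be much more complex. The second example in particular only defines the building blocks of a deep Q-learning agent (an epsilon-greedy policy, a replay memory, and a training step). It still needs a loop that steps through the environment, stores each experience with <code>remember</code>, and calls <code>replay</code> once enough experiences have been collected. Even so, the two examples give a general idea of how reinforcement learning works in Python using the OpenAI Gym library.</p>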
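
<p class="wp-block-paragraph">The training step in a Q-learning agent revolves around one small calculation, which is easier to see with numbers. The sketch below shows the standard Q-learning target for a single experience. The function name <code>q_targets</code> and the example values are made up for illustration; only the discount factor of 0.95 is taken from the example above.</p>

```python
import numpy as np


def q_targets(q_current, q_next, action, reward, done, gamma=0.95):
    """Return the training target vector for one experience."""
    # start from the current predictions so untouched actions produce no error
    target = q_current.copy()
    # a finished episode has no future value to add
    future = 0.0 if done else gamma * np.max(q_next)
    # only the action that was actually taken gets a new target
    target[action] = reward + future
    return target


q_current = np.array([1.2, 0.7])  # predicted values for the current state
q_next = np.array([0.5, 2.0])     # predicted values for the next state

# action 1 becomes 1.0 + 0.95 * 2.0 = 2.9, action 0 keeps its prediction of 1.2
print(q_targets(q_current, q_next, action=1, reward=1.0, done=False))

# when the episode has ended, the target for action 1 is just the reward, 1.0
print(q_targets(q_current, q_next, action=1, reward=1.0, done=True))
```

<p class="wp-block-paragraph">Leaving the other action at its current prediction matters: the network is only corrected for the action whose outcome was actually observed.</p>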



<h3 class="wp-block-heading">Limitations of reinforcement learning</h3>



<p class="wp-block-paragraph" id="9c92">While reinforcement learning is a powerful approach to machine learning, it does have some limitations. One of the main challenges with reinforcement learning is that it can be difficult to define the rewards and punishments that the agent will receive for its actions. This can make it difficult to train the agent to optimize its behavior in a way that aligns with the desired outcomes.</p>



<p class="wp-block-paragraph" id="a50c">Another limitation of reinforcement learning is that it can require a lot of data and computation in order to learn effectively. The agent must explore a wide range of possible actions and receive feedback in order to learn the optimal strategy, which can be time-consuming and resource-intensive.</p>
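
<p class="wp-block-paragraph">A quick calculation gives a feel for how much exploration even a toy agent needs. The sketch below takes the exploration settings from the agent defined earlier (epsilon starts at 1, is multiplied by 0.995, and stops at 0.01) and assumes the decay is applied once per update. The helper name <code>steps_until_floor</code> is made up for this illustration.</p>

```python
import math


def steps_until_floor(start=1.0, floor=0.01, decay=0.995):
    """Number of multiplicative decay steps until epsilon reaches the floor."""
    # solve start * decay**n = floor for n and round up to a whole step
    return math.ceil(math.log(floor / start) / math.log(decay))


n = steps_until_floor()
print(n)  # 919
```

<p class="wp-block-paragraph">Under these assumptions the agent keeps taking a noticeable share of random actions for roughly the first 900 updates, and every one of those updates needs experience gathered from the environment.</p>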



<p class="wp-block-paragraph" id="5428">Additionally, reinforcement learning can struggle with environments that are highly complex or stochastic, where the consequences of actions are difficult to predict. In these cases, it can be challenging for the agent to learn the optimal strategy and adapt its behavior effectively.</p>



<p class="wp-block-paragraph" id="2004">Overall, reinforcement learning is not a perfect solution, and these limitations need to be considered. To use it effectively, define the rewards and punishments carefully, make sure enough data and computation are available, and take the complexity of the environment into account.</p>



<p class="wp-block-paragraph" id="713a">In conclusion, reinforcement learning is a powerful approach to machine learning that allows agents to learn from experience and adapt their behavior based on the feedback they receive. It has many real-world applications, from robotics and gaming to finance and healthcare, and will continue to be an important area of research and development in the future.</p>



<p class="wp-block-paragraph">Read more posts on the AnalyticaDSS blog: <a href="https://analyticadss.com/blog">BLOGS</a></p>



<p class="wp-block-paragraph">Read more posts on Medium: <a href="https://medium.com/@aousabdo">Medium Blogs</a></p>



<p class="wp-block-paragraph">Read more posts on R-bloggers: <a href="https://www.r-bloggers.com/">https://www.r-bloggers.com</a></p>
<p>The post <a href="https://analyticadss.com/what-is-reinforcement-learning/">What is Reinforcement Learning?</a> appeared first on <a href="https://analyticadss.com">Analytica Data Science Solutions</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
