EMI #1

Miscellaneous notes related to coding, machine learning, data science

Takashi Nakamura, PhD
3 min read · Sep 6, 2020

Introduction

When I was a teenager, I had to do homework. For mathematics and physics, I had to repeat similar exercises over and over; my cram school teacher told us, “It is rensei” (rensei means training, drilling, or formalising in Japanese).

Recently, I have learnt many things at work. However, I realised that I rarely revisit what I learnt, so I sometimes forget how to solve a problem I have come across before. This series of posts (I hope it continues) will be a set of memorandums for my work in data science, coding, and machine learning.

Rensei: drilling, training

EMI-1: Python. Comprehension and generator

# List
[i for i in range(10)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
# Tuple? No: parentheses give a generator expression
(i for i in range(10))
# <generator object <genexpr> at 0x7fafa82c1250>
# Set
{i for i in range(10)}
# {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}

I had never tried a “tuple comprehension” before, and I realised there is no such thing: parentheses produce a generator expression instead. I then wondered whether I could write something similar to range(n) myself.
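
To actually get a tuple, the generator expression can be passed to tuple(). The snippet below is a minimal sketch of that; the variable names squares_gen and squares_tuple are only illustrative. It also shows that a generator can be consumed only once.

```python
# A generator expression is lazy: nothing is computed until it is iterated.
squares_gen = (i * i for i in range(5))
print(type(squares_gen).__name__)  # generator

# Passing a generator expression to tuple() builds a real tuple.
squares_tuple = tuple(i * i for i in range(5))
print(squares_tuple)  # (0, 1, 4, 9, 16)

# A generator is exhausted after one pass.
first_pass = list(squares_gen)
second_pass = list(squares_gen)
print(first_pass)   # [0, 1, 4, 9, 16]
print(second_pass)  # []
```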

Naive way: Too much memory use

from typing import List

def my_range_1(n: int) -> List[int]:
    # Build the whole list in memory before returning it
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

my_range_1(10)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

A better way: use a generator, which yields items one at a time instead of building a list

def my_range_2(n):
    num = 0
    while num < n:
        yield num
        num += 1

[i for i in my_range_2(10)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
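
To see why the generator version is lighter, here is a rough sketch comparing the two approaches. It repeats both functions so that it runs on its own. Note that sys.getsizeof only measures the container object itself (not the integers it refers to), so this is an indication rather than a precise memory profile.

```python
import sys

def my_range_1(n):
    # Naive version: materialises every number in a list
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

def my_range_2(n):
    # Generator version: produces one number at a time
    num = 0
    while num < n:
        yield num
        num += 1

n = 100_000
as_list = my_range_1(n)
as_generator = my_range_2(n)

# The list grows with n; the generator object stays small.
print(sys.getsizeof(as_list) > sys.getsizeof(as_generator))  # True

# Values are only computed when requested.
first, second = next(as_generator), next(as_generator)
print(first, second)  # 0 1
```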

We can also add start and step_size arguments

def my_range_3(start, stop, step_size):
    num = start
    while num < stop:
        yield num
        num += step_size

[i for i in my_range_3(0, 20, 3)]
# [0, 3, 6, 9, 12, 15, 18]
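
One caveat with my_range_3: a step_size of zero loops forever, and a negative step_size never counts down. The sketch below is one possible extension, under the assumption that we want to mimic the built-in range more closely; my_range_4 is a hypothetical name, not something from the notes above.

```python
from typing import Iterator

def my_range_4(start: int, stop: int, step_size: int = 1) -> Iterator[int]:
    # Hypothetical extension of my_range_3 that also handles negative steps.
    # Because this is a generator, the check runs on the first iteration,
    # not at call time.
    if step_size == 0:
        raise ValueError("step_size must not be zero")
    num = start
    # Count up for a positive step, count down for a negative step
    while (num < stop) if step_size > 0 else (num > stop):
        yield num
        num += step_size

print(list(my_range_4(0, 20, 3)))   # [0, 3, 6, 9, 12, 15, 18]
print(list(my_range_4(10, 0, -4)))  # [10, 6, 2]
```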

EMI-2: Python. Multiple keys for dictionary

I normally use either a str or an int as a dictionary key, and have rarely used a tuple. I came across the use case below when I was learning Natural Language Processing.

dic = {}
dic["key1", "key2"] = "something"
dic
# {('key1', 'key2'): 'something'}
dic.get(("key1", "key2"))
# 'something'
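
As an illustration of where tuple keys are handy in NLP, here is a small sketch that counts bigrams (pairs of adjacent words). The sentence and the name bigram_counts are made up for this example; they are not from the material I was studying.

```python
from collections import defaultdict

# Made-up sentence, purely for illustration
words = "i like coding in python and i like coding in shell".split()

# Each key is a (first_word, second_word) tuple
bigram_counts = defaultdict(int)
for first, second in zip(words, words[1:]):
    bigram_counts[first, second] += 1

print(bigram_counts["i", "like"])             # 2
print(bigram_counts["coding", "in"])          # 2
print(bigram_counts.get(("in", "python")))    # 1
```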

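I also realised that I did not intuitively understand how to do an append-like operation on a tuple. Note that tuple(word) iterates over the string and splits it into characters, whereas (word,) is a one-element tuple. For example: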

current_tuple = ("I", "like", "coding", "in")
word = "Python"
current_tuple + tuple(word)
# ('I', 'like', 'coding', 'in', 'P', 'y', 't', 'h', 'o', 'n')
current_tuple + (word,)
# ('I', 'like', 'coding', 'in', 'Python')
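
Two details are easy to miss here, so the short sketch below spells them out: + returns a new tuple and leaves the original untouched, and the trailing comma matters because (word) without it is just a string.

```python
current_tuple = ("I", "like", "coding", "in")
word = "Python"

# Tuples are immutable: + builds a new tuple
extended = current_tuple + (word,)
print(extended)       # ('I', 'like', 'coding', 'in', 'Python')
print(current_tuple)  # ('I', 'like', 'coding', 'in')

# Without the trailing comma, (word) is a plain str, so + fails
try:
    current_tuple + (word)
    error_name = None
except TypeError as error:
    error_name = type(error).__name__
print(error_name)  # TypeError
```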

EMI-3: Python. Extend a list after some data processing for lists

Assume we have a list that contains multiple lists. We want to process each inner list and collect the results into one flat list, like this:

master_list = []
for each_list in original_list:
    master_list.extend(some_function(each_list))

As a concrete example, suppose we want to remove the first element of each inner list.

master_list = []
original_list = [["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"]]

def remove_first_element(my_list):
    return my_list[1:]

for each_list in original_list:
    master_list.extend(remove_first_element(each_list))

master_list
# ['b', 'c', 'e', 'g', 'h', 'i']

How do we write the above code in a single line?

Initial (wrong) idea: an append-like process, which keeps the nested structure

[remove_first_element(each_list) for each_list in original_list]
# [['b', 'c'], ['e'], ['g', 'h', 'i']]

Solution 1: Use sum with an empty list as the start value

sum((remove_first_element(each_list) for each_list in original_list), [])
# ['b', 'c', 'e', 'g', 'h', 'i']

Solution 2: Use the map function

sum(map(remove_first_element, original_list), [])
# ['b', 'c', 'e', 'g', 'h', 'i']
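
One thing to keep in mind: sum on lists creates a new list at every step, so it can become slow when there are many inner lists. Two other one-liners that avoid this are sketched below, using a nested comprehension and itertools.chain.from_iterable. The names flat_1 and flat_2 are only for illustration.

```python
from itertools import chain

original_list = [["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"]]

def remove_first_element(my_list):
    return my_list[1:]

# Nested comprehension: the outer loop comes first, then the inner loop
flat_1 = [item for each_list in original_list for item in remove_first_element(each_list)]
print(flat_1)  # ['b', 'c', 'e', 'g', 'h', 'i']

# chain.from_iterable lazily concatenates the processed lists
flat_2 = list(chain.from_iterable(map(remove_first_element, original_list)))
print(flat_2)  # ['b', 'c', 'e', 'g', 'h', 'i']
```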

EMI-4: Shell. Export a variable after processing a different variable

This was a question from my colleague. At that moment, I did not know how to do it, but we figured it out together. Assume we have a variable MY_INPUT as follows:

$ export MY_INPUT="My Input"
$ echo $MY_INPUT
My Input

We would like to assign the lowercased string to a different variable, MY_LOWER_INPUT. To convert a string to lowercase, we can use tr as below:

$ echo "My Input"  | tr '[:upper:]' '[:lower:]'
my input

Then, how do we assign the string to the variable MY_LOWER_INPUT? The code below works, at least: it uses command substitution, $(...), to capture the output of the echo and tr pipeline, and exports the result as MY_LOWER_INPUT.

$ export MY_LOWER_INPUT="$(echo "$MY_INPUT" | tr '[:upper:]' '[:lower:]')"
$ echo $MY_LOWER_INPUT
my input
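
For comparison, the same idea can be sketched in Python with os.environ. This is only an illustration under my own assumptions, not what we did with my colleague. A variable set this way is visible to the Python process and to the child processes it starts, but not to the shell that launched Python.

```python
import os
import subprocess
import sys

# Set up the input here so that the sketch runs on its own
os.environ["MY_INPUT"] = "My Input"

# Lowercase the value and store it under a different variable name
os.environ["MY_LOWER_INPUT"] = os.environ["MY_INPUT"].lower()

# A child process inherits the updated environment
result = subprocess.run(
    [sys.executable, "-c", "import os; print(os.environ['MY_LOWER_INPUT'])"],
    capture_output=True,
    text=True,
    check=True,
)
child_value = result.stdout.strip()
print(child_value)  # my input
```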

What is EMI?

EMI stands for “Expose My Ignorance”. Since I finished my PhD, no one has corrected or challenged my writing, coding, ML theory, the value of life, etc. I realised I need to take notes every week on what I have learnt.

Endnote

This post was written twice because the draft was lost while editing. What a great start!
