EMI #1
Miscellaneous notes related to coding, machine learning, data science
--
Introduction
When I was a teenager, I had to do homework. For mathematics and physics, I had to repeat similar exercises over and over; my cram school teacher told us, “It is rensei” (rensei means training, drilling, or formalising in Japanese).
Recently I have learnt many things at work. However, I realised that I rarely revisit what I learnt, so I sometimes forget how to solve a problem I have come across before. This series of posts (I hope it continues) will serve as memoranda of my work in data science, coding, and machine learning.
EMI-1: Python. Comprehension and generator
# List
[i for i in range(10)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Tuple
(i for i in range(10))
# <generator object <genexpr> at 0x7fafa82c1250>

# Set
{i for i in range(10)}
# {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
I had never tried a “tuple comprehension” before, and I realised there is no such thing: parentheses around a comprehension create a generator expression, not a tuple. I then wondered whether I could write something similar to range(n) myself.
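As a side note on the parenthesised form above, a tuple can still be built by passing the generator expression to tuple(). The following is a minimal sketch; the variable names are only for illustration. It also shows that a generator can be consumed only once.

```python
# A generator expression is lazy; tuple() consumes it to build a real tuple.
gen = (i for i in range(10))
as_tuple = tuple(gen)
print(as_tuple)   # (0, 1, 2, 3, 4, 5, 6, 7, 8, 9)

# The generator is now exhausted, so a second pass yields nothing.
leftover = tuple(gen)
print(leftover)   # ()
```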
Naive way: Too much memory use
from typing import List

def my_range_1(n: int) -> List[int]:
    # Builds the whole list in memory before returning it
    num, nums = 0, []
    while num < n:
        nums.append(num)
        num += 1
    return nums

my_range_1(10)
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
A better way: use a generator, which yields items one at a time instead of building a list
def my_range_2(n):
    num = 0
    while num < n:
        yield num
        num += 1

[i for i in my_range_2(10)]
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
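To make the memory argument a little more concrete, here is a rough sketch that compares the generator with an eagerly built list. It repeats my_range_2 so the snippet is self-contained; the size n and the use of sys.getsizeof and itertools.islice are my own choices for illustration. Note that getsizeof reports only the size of the container object itself, so treat the comparison as indicative.

```python
import sys
from itertools import islice

def my_range_2(n):
    num = 0
    while num < n:
        yield num
        num += 1

n = 100_000
lazy = my_range_2(n)

# islice pulls only the first five items; the rest are never computed.
first_five = list(islice(lazy, 5))
print(first_five)  # [0, 1, 2, 3, 4]

# The generator object stays small no matter how large n is,
# while the list has to hold a reference to every element.
eager = list(range(n))
print(sys.getsizeof(lazy) < sys.getsizeof(eager))  # True
```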
We can also add a start value and a step_size
def my_range_3(start, stop, step_size):
    num = start
    while num < stop:
        yield num
        num += step_size

[i for i in my_range_3(0, 20, 3)]
# [0, 3, 6, 9, 12, 15, 18]
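One caveat with the version above is that a zero or negative step_size never reaches stop, so the loop would not terminate. The sketch below is a hypothetical variant (the name my_range_4 is mine, not part of the original notes) that rejects a zero step and counts downwards for a negative step, mirroring how the built-in range behaves.

```python
def my_range_4(start, stop, step_size=1):
    """Hypothetical variant of my_range_3 that also counts downwards."""
    # Because this is a generator function, the check runs when
    # iteration starts, not when my_range_4(...) is called.
    if step_size == 0:
        raise ValueError("step_size must not be zero")
    num = start
    # Count up for a positive step, down for a negative one.
    while (num < stop) if step_size > 0 else (num > stop):
        yield num
        num += step_size

print(list(my_range_4(0, 20, 3)))   # [0, 3, 6, 9, 12, 15, 18]
print(list(my_range_4(10, 0, -4)))  # [10, 6, 2]
```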
EMI-2: Python. Multiple keys for dictionary
I normally use either str or int for dictionary keys and have rarely used a tuple as a key. I came across the use case below while learning Natural Language Processing.
dic = {}
dic["key1", "key2"] = "something"
# {('key1', 'key2'): 'something'}

dic.get(('key1', 'key2'))
# 'something'
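The notes do not say which NLP task this came from, so the following is only an assumed example: counting bigrams (pairs of adjacent words), where each pair is naturally a tuple key. The token list is made up for illustration.

```python
from collections import Counter

# Made-up token list, purely for illustration.
tokens = ["i", "like", "coding", "and", "i", "like", "python"]

# zip pairs each token with its successor; each pair becomes a tuple key.
bigram_counts = Counter(zip(tokens, tokens[1:]))

print(bigram_counts[("i", "like")])       # 2
print(bigram_counts[("like", "python")])  # 1
print(bigram_counts["i", "like"])         # 2, same key without parentheses
```

Tuples work as keys because they are hashable (as long as their elements are); a list in the same position would raise a TypeError.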
I also realised that I did not intuitively understand how to do an append-like operation on a tuple. For example:
current_tuple = ("I", "like", "coding", "in")
word = "Python"

current_tuple + tuple(word)
# ('I', 'like', 'coding', 'in', 'P', 'y', 't', 'h', 'o', 'n')

current_tuple + (word,)
# ('I', 'like', 'coding', 'in', 'Python')
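The difference comes down to the trailing comma: it is the comma, not the parentheses, that makes a one-element tuple, while tuple(word) iterates over the characters of the string. A small sketch to check this (the names extended and also_extended are mine):

```python
word = "Python"

print(type((word)))   # <class 'str'>, parentheses alone do nothing
print(type((word,)))  # <class 'tuple'>, the comma makes the tuple

current_tuple = ("I", "like", "coding", "in")

# Tuples are immutable, so + builds a new tuple and leaves the original alone.
extended = current_tuple + (word,)
print(extended)       # ('I', 'like', 'coding', 'in', 'Python')
print(current_tuple)  # ('I', 'like', 'coding', 'in')

# Unpacking with * is an equivalent way to write the same thing.
also_extended = (*current_tuple, word)
print(also_extended == extended)  # True
```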
EMI-3: Python. Extend a list after some data processing for lists
Assume we have a list that contains multiple lists. We want to process each inner list and extend a master list with the processed result, such as:
master_list = []
for each_list in original_list:
    master_list.extend(some_function(each_list))
Now suppose the processing we want is to remove the first element of each inner list.
master_list = []
original_list = [["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"]]

def remove_first_element(my_list):
    return my_list[1:]

for each_list in original_list:
    master_list.extend(remove_first_element(each_list))

master_list
# ['b', 'c', 'e', 'g', 'h', 'i']
How do we write the above code in a single line?
Initial (wrong) idea: an append-like process, which keeps the nesting
[remove_first_element(each_list) for each_list in original_list]
# [['b', 'c'], ['e'], ['g', 'h', 'i']]
Solution 1: Use sum with an empty list as the start value
sum((remove_first_element(each_list) for each_list in original_list), [])
# ['b', 'c', 'e', 'g', 'h', 'i']
Solution 2: Use the map function
sum(map(remove_first_element, original_list), [])
# ['b', 'c', 'e', 'g', 'h', 'i']
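Both sum-based solutions work, but sum creates a new list for every addition, so the cost grows quadratically with the number of inner lists. Two alternatives that avoid this, offered as a sketch rather than the one right answer, are a nested comprehension and itertools.chain.from_iterable. The setup is repeated so the snippet runs on its own; flat_nested and flat_chain are names I made up.

```python
from itertools import chain

original_list = [["a", "b", "c"], ["d", "e"], ["f", "g", "h", "i"]]

def remove_first_element(my_list):
    return my_list[1:]

# Nested comprehension: the outer loop comes first, then the inner loop.
flat_nested = [
    item
    for each_list in original_list
    for item in remove_first_element(each_list)
]
print(flat_nested)  # ['b', 'c', 'e', 'g', 'h', 'i']

# chain.from_iterable lazily concatenates the processed lists.
flat_chain = list(chain.from_iterable(map(remove_first_element, original_list)))
print(flat_chain)   # ['b', 'c', 'e', 'g', 'h', 'i']
```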
EMI-4: Shell. Export a variable after some processing for different variable
This was a question from a colleague. At the time I did not know how to do it, but we figured it out together. Assume we have a variable MY_INPUT as follows:
$ export MY_INPUT="My Input"
$ echo $MY_INPUT
My Input
We would like to assign the lowercased string to a different variable, MY_LOWER_INPUT. To convert a string to lowercase, we can use tr as below:
$ echo "My Input" | tr '[:upper:]' '[:lower:]'
my input
Then, how do we assign the result to the variable MY_LOWER_INPUT? The code below works: it pipes the output of the echo command through tr inside a command substitution, $(...), and exports the result as MY_LOWER_INPUT.
$ export MY_LOWER_INPUT=$(echo "$MY_INPUT" | tr '[:upper:]' '[:lower:]')
$ echo $MY_LOWER_INPUT
my input
What is EMI?
EMI stands for “Expose My Ignorance”. Since finishing my PhD, no one corrects or challenges my writing, coding, ML theory, the value of life, etc., so I realised I need to take notes on what I learn every week.
Endnote
This post was written twice because the first draft was lost while editing. What a great start!