EMI #2

Miscellaneous notes related to coding, machine learning, data science

A month has already passed since the last EMI post. Here are some findings from my daily work.

EMI-5: Python. Jupyter notebook, execute Terminal command

I knew that Jupyter Notebook can execute Linux commands directly from a cell, but I had never used this feature before.

echo command on Jupyter Notebook
my_dir
├── img_1.png
├── img_2.png
├── img_3.png
└── img_4.jpg
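For comparison, the same kind of shell command can also be run from plain Python outside a notebook. This is only a minimal sketch, assuming a Unix-like system where echo is available; the helper name run_command is hypothetical. Inside a Jupyter cell, the shell escape (for example !echo hello) is the shorter form.

```python
import subprocess

def run_command(args):
    """Run a command given as a list of arguments and return its stdout as text."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout

# Roughly what "!echo hello" does in a notebook cell
print(run_command(["echo", "hello"]))
```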

EMI-6: Python. Find the longest sub-list from the main-list.

When I was studying NLP, I wanted to find the longest sub-list in the main list (i.e. find the largest number of words in the batch). I made a mistake in how I used the max function.

# main list
m_l = [[32, 37, 4, 999999999, 43],
       [30, 156, 78, 3614, 25, 11, 169, 3096, 21],
       [1, 2], [0], [], []]
# plain "max" compares the sub-lists element by element,
# so it returns the lexicographically largest list, not the longest one
max(m_l)  # [32, 37, 4, 999999999, 43]
# find the longest sub-list in the main list
max(m_l, key=len)
# [30, 156, 78, 3614, 25, 11, 169, 3096, 21]

If the main list contains values of other types (i.e. elements that are not lists), we need to add an isinstance check on the type of each object.

max(m_l, key=lambda s_l: len(s_l) if isinstance(s_l, list) else 0)
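As a small illustration of why the type check matters, here is a sketch with a hypothetical mixed list (the values and the helper name safe_len are made up for this example). Without the check, len would raise a TypeError on the integer, and a long string could win over the real sub-lists.

```python
# Hypothetical main list mixing sub-lists with non-list values
mixed = [[1, 2, 3], 7, "a long string here", [4, 5], None]

def safe_len(item):
    """Return the length for lists and 0 for anything else."""
    return len(item) if isinstance(item, list) else 0

longest = max(mixed, key=safe_len)
print(longest)  # [1, 2, 3]
```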

The page explains a different example using a dictionary

square = {2: 4, -3: 9, -1: 1, -2: 4}
# the largest key
max(square)  # 2
# the key whose value is the largest
max(square, key=lambda k: square[k])  # -3

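Alternatively, use operator.itemgetter together with the dictionary's items method (iteritems in Python 2)

import operator
stats = {'a': 1000, 'b': 3000, 'c': 100}
# compare the (key, value) pairs by value, then take the key of the winner
max(stats.items(), key=operator.itemgetter(1))[0]  # "b"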

EMI-7: YAML. Update environment variables and substitute

For Kubernetes clusters, we can configure environment variables. I came across a situation where I wanted to apply some files to running replicas while keeping those files reusable and flexible across different deployment versions, project names, etc. I ended up with this idea:

  1. Export the environment variables (export statements, loaded with the source command)
  2. With the environment variables set, update the YAML file (envsubst command)

The my_env_val.txt file is defined as:

export MY_APP_VERSION=Version1.2.3.4
export MY_SCHEME_ONE="To Be Defined"

The initial.yaml file is below:

info:
version: $MY_APP_VERSION
schemes:
- $MY_SCHEME_ONE

Run the following commands:

$ source my_env_val.txt
$ envsubst < initial.yaml > after.yaml

The resulting after.yaml file now contains:

info:
version: Version1.2.3.4
schemes:
- To Be Defined
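The same substitution can be approximated in Python, which may help when envsubst is not installed. This is only a sketch using string.Template from the standard library, not what envsubst does internally; the variable values are set in the script purely to mirror my_env_val.txt. One difference to be aware of: safe_substitute leaves unknown variables untouched, whereas envsubst replaces them with an empty string.

```python
import os
from string import Template

# Values mirroring my_env_val.txt (set here only so the sketch is self-contained)
os.environ["MY_APP_VERSION"] = "Version1.2.3.4"
os.environ["MY_SCHEME_ONE"] = "To Be Defined"

initial_yaml = """info:
version: $MY_APP_VERSION
schemes:
- $MY_SCHEME_ONE
"""

# Replace every $NAME placeholder with the matching environment variable
after_yaml = Template(initial_yaml).safe_substitute(os.environ)
print(after_yaml)
```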

EMI-8: Python. Iter function for a string.

For NLP applications, we feed single words into a function iteratively to process the target sentence. I guess it’s a rookie mistake, but I did not know how the iter function behaves differently for a string and for a list.

# iter over a string yields one character at a time
str_iter = iter("a blue sky")
for i in range(3):
    print(i, next(str_iter))
# 0 a
# 1
# 2 b
# iter over a list yields one element at a time
list_iter = iter(["a blue sky"])
print(next(list_iter))
# a blue sky

What I should have done is split the target string into a list of words and then feed that list to the iter function.
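A minimal sketch of that fix, assuming plain whitespace splitting is good enough (a real NLP pipeline would probably use a proper tokenizer instead):

```python
sentence = "a blue sky"

# Split on whitespace to get a list of words, then iterate over that list
word_iter = iter(sentence.split())
words = [next(word_iter) for _ in range(3)]
print(words)  # ['a', 'blue', 'sky']
```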

EMI-9: Python. literal_eval for JSON string

I think I have used ast.literal_eval more than a dozen times to convert the string representation of a list into an actual list. However, I always forget it.

import ast

s = '["a", "b", "c"]'

l = ast.literal_eval(s)
print(l)
# ['a', 'b', 'c']
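One caveat worth keeping in mind: ast.literal_eval parses Python literals, not JSON. The sketch below (the example strings are made up) compares it with json.loads from the standard library. Both agree on the simple list above, but JSON-only literals such as true and null are rejected by literal_eval.

```python
import ast
import json

s = '["a", "b", "c"]'
# For this string both parsers return the same list
same_result = json.loads(s) == ast.literal_eval(s)
print(same_result)  # True

# true and null are valid JSON but not valid Python literals
json_only = '{"ok": true, "value": null}'
parsed = json.loads(json_only)
print(parsed)  # {'ok': True, 'value': None}

try:
    ast.literal_eval(json_only)
    literal_eval_failed = False
except ValueError:
    literal_eval_failed = True
print(literal_eval_failed)  # True
```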

I found an example in the NLP course I am taking, which loads multiple JSON examples from a file.

def convert_json_examples_to_text(filepath):
    # Parse each line of the file as a Python literal (one example per line)
    with open(filepath) as f:
        example_jsons = list(map(ast.literal_eval, f))
    # Then, read in the json from the example file

Data scientist and machine learning engineer. PhD in Signal Processing for Neuroscience. https://www.linkedin.com/in/takashi-nakamura-004875a6/