# memrise_word_extract.py
'''
Assumption: the words are taken from memrise.com.
The words are stored in a file in this format:
word (part of speech)
meaning
For example:
aberrant (adjective)
markedly different from an accepted norm
aberration (noun)
a deviation from what is normal or expected
abstain (verb)
choose not to consume or take part in (particularly something enjoyable)
These entries are converted to the following format:
word,-,(part of speech) meaning
For example:
aberrant,-,(adjective) markedly different from an accepted norm
aberration,-,(noun) a deviation from what is normal or expected
abstain,-,(verb) choose not to consume or take part in (particularly something enjoyable)
'''
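# Read all lines; the file alternates between "word (part of speech)" and "meaning".
with open("gre1000") as in_file:
    content = in_file.readlines()

# Even-indexed lines hold the word and its part of speech, odd-indexed lines the meaning.
words = content[0::2]
meanings = content[1::2]
if len(words) != len(meanings):
    raise ValueError(
        "Lengths of word and meaning are supposed to be equal. Aborting.",
        len(words), len(meanings))

words_only = []
parts_of_speech = []
for word in words:
    # Split on the first space only: "aberrant (adjective)" -> ["aberrant", "(adjective)\n"]
    split_word = word.split(' ', 1)
    words_only.append(split_word[0])
    parts_of_speech.append(split_word[1].strip())

# Write one "word,-,(part of speech) meaning" line per entry.
with open("gre1000_formatted", "w") as out_file:
    for i in range(len(words)):
        line = words_only[i] + ",-," + parts_of_speech[i] + " " + meanings[i].strip() + "\n"
        out_file.write(line)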
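
The conversion described in the docstring can also be tried on an in-memory list, without touching the gre1000 file. The following is a minimal sketch, not part of the original script: the function name format_entries and the sample list are hypothetical, and skipping blank lines is an added assumption (the script above does not do this).

```python
def format_entries(lines):
    """Turn alternating 'word (part of speech)' / 'meaning' lines
    into 'word,-,(part of speech) meaning' strings."""
    # Assumption: blank lines carry no data, so they are dropped before pairing.
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % 2 != 0:
        raise ValueError(
            "Expected an even number of non-blank lines.", len(cleaned))

    formatted = []
    # Pair each header line (even index) with the meaning line that follows it.
    for header, meaning in zip(cleaned[0::2], cleaned[1::2]):
        # partition() never raises: a header without a space yields an empty part of speech.
        word, _, part_of_speech = header.partition(" ")
        formatted.append(f"{word},-,{part_of_speech.strip()} {meaning}")
    return formatted


# Hypothetical sample input in the same layout as the docstring example.
sample = [
    "aberrant (adjective)\n",
    "markedly different from an accepted norm\n",
    "abstain (verb)\n",
    "choose not to consume or take part in (particularly something enjoyable)\n",
]

for entry in format_entries(sample):
    print(entry)
```

Keeping the conversion in a function that takes and returns lists makes it possible to check the output format before writing anything to gre1000_formatted.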