umr-tetis / MOOD / mood-tetis-tweets-collect
Commit c9838f7b, authored Jan 17, 2022 by Rémy Decoupes
fix when fix_bad_quote is stopped and restarted
Parent: 2132c4ec
Changes: 1
elasticsearch/src/fix_bad_quote_json.py @ c9838f7b
@@ -43,14 +43,16 @@ logger.info("Transform jsonl single quotes into double quotes")
 for root, dirs, files in os.walk(path_dir_in):
     for name in files:
         fr = open(path_dir_in + "/" + name)
-        fw = open(path_dir_out + "/" + name, "w")
+        fw = open(path_dir_out + "/" + name)
         nb_lines_in = sum(1 for line in fr)
         try:
             nb_lines_out = sum(1 for line in fw)
         except: #file is empty
             nb_lines_out = 0
+        logger.info("file: " + name + " in: " + str(nb_lines_in) + " and out:" + str(nb_lines_out))
         if nb_lines_in != nb_lines_out:
             fr.seek(0) # go to the start of the file
+            fw = open(path_dir_out + "/" + name, "w")
             for line in fr:
                 json_dat = json.dumps(ast.literal_eval(line))
                 dict_dat = json.loads(json_dat)
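The change above makes the script restartable: the output file is first opened for reading so its lines can be counted, and it is reopened in "w" mode only when the input and output line counts differ. A minimal sketch of that idea follows, not the author's code. The helper names `count_lines` and `convert_if_incomplete` are hypothetical, and since the rest of the loop body is elided in the diff, writing one `json.dumps` result per line is an assumption.

```python
import ast
import json


def count_lines(path):
    """Return the number of lines in path, or 0 if the file does not exist."""
    try:
        with open(path) as f:
            return sum(1 for _ in f)
    except FileNotFoundError:
        return 0


def convert_if_incomplete(path_in, path_out):
    """Rewrite path_out from path_in unless both already have the same line count.

    Returns True when a conversion was performed, False when it was skipped.
    """
    if count_lines(path_in) == count_lines(path_out):
        return False  # already fully converted on a previous run
    # Opening with "w" truncates any partial output left by an interrupted run.
    with open(path_in) as fr, open(path_out, "w") as fw:
        for line in fr:
            # Python-literal dict (single quotes) -> JSON text (double quotes).
            # Assumption: one JSON document is written per input line.
            fw.write(json.dumps(ast.literal_eval(line)) + "\n")
    return True
```

Calling `convert_if_incomplete("in/tweets.jsonl", "out/tweets.jsonl")` (hypothetical paths) a second time after a completed run returns False and leaves the output untouched. After an interrupted run, the line counts differ and the file is rebuilt from the start.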