XLS data transfer to Kobo #19

Open
wants to merge 143 commits into base: main
Changes from 86 commits
Commits
80b419b
methods added to xml.py, and run2.py can be used to transfer google f…
yaqubdiyaa Nov 26, 2023
42c04d3
can run run2.py form command line (--excel_file 'pathname')
yaqubdiyaa Nov 27, 2023
0e017a9
fixed error so it can run from command line
yaqubdiyaa Nov 28, 2023
10d10c4
working from command line (need to clean)
yaqubdiyaa Dec 2, 2023
c833de7
cleaned (and compatible with original kobo-transfer code)
yaqubdiyaa Dec 2, 2023
105dec7
Update README.md
yaqubdiyaa Dec 2, 2023
433c0aa
Update README.md
yaqubdiyaa Dec 2, 2023
396695b
Merge pull request #1 from yaqubdiyaa/xls-xml
yaqubdiyaa Dec 2, 2023
75043c1
Update README.md
yaqubdiyaa Dec 2, 2023
981a8b1
Update README.md
yaqubdiyaa Dec 2, 2023
4b254c1
Update README.md
yaqubdiyaa Dec 2, 2023
862f06a
Update README.md
yaqubdiyaa Dec 2, 2023
4333769
default xlsx added
yaqubdiyaa Dec 2, 2023
3947c22
Merge pull request #2 from yaqubdiyaa/xls-xml
yaqubdiyaa Dec 2, 2023
0117f6e
Delete Filename(google).xlsx
yaqubdiyaa Dec 2, 2023
4b6343f
Update README.md
yaqubdiyaa Dec 2, 2023
01aaf5f
downloads media from google drive (connected to config file), but err…
yaqubdiyaa Dec 29, 2023
edcde7a
cleaned + refactored (methods extracted to new file)
yaqubdiyaa Dec 29, 2023
28cf4f8
Update README.md
yaqubdiyaa Dec 30, 2023
5e5bb4b
accounted for ranking questions + group questions upload from xlsx to…
yaqubdiyaa Jan 5, 2024
f247a58
Delete ~$Kobo_Platform_Test_-_rank-groups.xlsx
yaqubdiyaa Jan 5, 2024
37b4be0
Delete Filename(google).xlsx
yaqubdiyaa Jan 5, 2024
d3708d0
Update README.md
yaqubdiyaa Jan 5, 2024
758f043
Update README.md
yaqubdiyaa Jan 5, 2024
9f55873
Merge branch 'general-xlsx-kobo'
yaqubdiyaa Jan 5, 2024
6557dcd
hardcoded data.xlsx
yaqubdiyaa Jan 5, 2024
a6fbf05
removed google drive media methods
yaqubdiyaa Jan 5, 2024
b40eee4
Merge pull request #4 from yaqubdiyaa/general-xlsx-kobo
yaqubdiyaa Jan 5, 2024
43ee73b
grouping works with data.xlsx
yaqubdiyaa Jan 5, 2024
a830336
works from command line with python3 run2.py -xt -ef ./data.xlsx
yaqubdiyaa Jan 5, 2024
5739e88
commented out a line for testing so output file not saved in directory
yaqubdiyaa Jan 5, 2024
6ee80a6
Merge pull request #5 from yaqubdiyaa/main
yaqubdiyaa Jan 5, 2024
0b4cc7b
repeating groups works correctly
yaqubdiyaa Jan 6, 2024
59b9af4
Merge pull request #6 from yaqubdiyaa/general-xlsx-kobo
yaqubdiyaa Jan 6, 2024
3483c0b
importing media
yaqubdiyaa Jan 7, 2024
921781f
Update README.md
yaqubdiyaa Jan 7, 2024
e62fe0f
Update README.md
yaqubdiyaa Jan 8, 2024
0182ce2
Update README.md
yaqubdiyaa Jan 8, 2024
41c9436
Update README.md
yaqubdiyaa Jan 8, 2024
08d40e6
Update README.md
yaqubdiyaa Jan 8, 2024
ad30482
combined google+general xml methods, match_kobo method for warning me…
yaqubdiyaa Jan 8, 2024
6c187ba
deleted specific google data-->xml method
yaqubdiyaa Jan 8, 2024
886ecd0
Update README.md
yaqubdiyaa Jan 8, 2024
7ac185d
all none shows up as "" now, repeat groups not working anymore
yaqubdiyaa Jan 8, 2024
79778c3
fixed editing repeat group responses
yaqubdiyaa Jan 9, 2024
56d3dc3
initial submission of repeat group (with no uuid)
yaqubdiyaa Jan 9, 2024
09d4b90
Update README.md
yaqubdiyaa Jan 10, 2024
abb12dd
Update README.md
yaqubdiyaa Jan 11, 2024
34bab1b
Update README.md
yaqubdiyaa Jan 11, 2024
98ab5f0
Update README.md
yaqubdiyaa Jan 11, 2024
3c78db8
Merge pull request #7 from yaqubdiyaa/general-xlsx-kobo
yaqubdiyaa Jan 12, 2024
bc5f01b
changes to speed up transfer
yaqubdiyaa Jan 17, 2024
9acd406
Update README.md
yaqubdiyaa Jan 17, 2024
765cf0a
Update README.md
yaqubdiyaa Jan 17, 2024
48bd5e2
refactored for shorter/cleaner functions
yaqubdiyaa Jan 21, 2024
8942a5d
commented out get_media, del_media, and created single_submission_xml…
yaqubdiyaa Jan 21, 2024
47329e0
added comments
yaqubdiyaa Jan 21, 2024
b20342d
removed unnecessary imports
yaqubdiyaa Jan 21, 2024
0ab6cfd
Update README.md
yaqubdiyaa Jan 21, 2024
b50f254
removed imports in media.py
yaqubdiyaa Jan 27, 2024
10398da
test xlsx (import and downloaded)
yaqubdiyaa Jan 27, 2024
26f17f2
fixed -gt multiple select
yaqubdiyaa Jan 28, 2024
26fdf2b
Merge branch 'main' of https://github.com/yaqubdiyaa/kobo-transfer
yaqubdiyaa Jan 28, 2024
aeb56ae
fixed -gt multiple select
yaqubdiyaa Jan 28, 2024
20be0a6
Update README.md
yaqubdiyaa Jan 28, 2024
498876d
Merge branch 'main' of https://github.com/yaqubdiyaa/kobo-transfer
yaqubdiyaa Jan 29, 2024
db8d558
Merge pull request #11 from yaqubdiyaa/main
yaqubdiyaa Jan 29, 2024
fc5a9cd
deleted demodownloaded xlsx
yaqubdiyaa Jan 29, 2024
9c318c8
edited run.py
yaqubdiyaa Jan 29, 2024
8b05753
Delete DemoDownloaded.xlsx
yaqubdiyaa Jan 29, 2024
70a360c
Delete Kobo_Platform_Test_-_rank-groups.xlsx
yaqubdiyaa Jan 29, 2024
348d13f
Delete run2.py
yaqubdiyaa Jan 29, 2024
34fef1b
Delete DemoProjectDownloadedData.xlsx
yaqubdiyaa Jan 29, 2024
ce9a24d
Delete repeatgroupproject.xlsx
yaqubdiyaa Jan 29, 2024
4293f6d
Merge pull request #12 from yaqubdiyaa/run.py
yaqubdiyaa Jan 29, 2024
e655f6a
Update README.md
yaqubdiyaa Jan 29, 2024
e41830a
Update README.md
yaqubdiyaa Jan 29, 2024
e19a466
Update README.md
yaqubdiyaa Jan 29, 2024
d92aab0
Update README.md
yaqubdiyaa Jan 29, 2024
23ca144
Update README.md
yaqubdiyaa Jan 29, 2024
6b0f8f3
added warning method
yaqubdiyaa Feb 5, 2024
d77c453
Update README.md
yaqubdiyaa Feb 5, 2024
bb1e6c2
Update README.md
yaqubdiyaa Feb 5, 2024
a8e360f
ds
yaqubdiyaa Feb 5, 2024
fad0cda
Merge branch 'run.py'
yaqubdiyaa Feb 5, 2024
e9d3da9
Delete .DS_Store
yaqubdiyaa Feb 5, 2024
64fb19f
Update README.md
yaqubdiyaa Feb 18, 2024
d942c6b
removed google code
yaqubdiyaa Feb 18, 2024
b297beb
reformatted using black
yaqubdiyaa Feb 18, 2024
64fddf5
supports nested groups (not tested w multiple sheets/repeat)
yaqubdiyaa Feb 19, 2024
23d57c2
supports multiple repeat groups (more than 2 sheets)
yaqubdiyaa Feb 19, 2024
2ef083f
nested repeat partially working (check note)
yaqubdiyaa Feb 21, 2024
5aa8c44
nested repeat working
yaqubdiyaa Feb 22, 2024
dbd237f
changed all ==None to is None
yaqubdiyaa Feb 22, 2024
4836e10
- changed == 'end' or == 'start to in['end','start']
yaqubdiyaa Feb 22, 2024
6fe0984
both nested repeat and multiple repeats working
yaqubdiyaa Feb 22, 2024
49c02fa
removed initial_repeat logic
yaqubdiyaa Feb 22, 2024
cc46afe
changed all cell_value == "" to not cell_value
yaqubdiyaa Feb 22, 2024
321e52d
removed all gtransfer logic
yaqubdiyaa Feb 22, 2024
48ed103
made test/new.py repeat function consistent with original
yaqubdiyaa Feb 22, 2024
88a3c39
nested repeat still working, case 2 working (havent checked case 1 bu…
yaqubdiyaa Mar 2, 2024
3a190d8
not using _ in general_, fixed geopoint, passing index in for media
yaqubdiyaa Mar 4, 2024
bae6756
fixed format for uuid:eo923048 (still need to test with missing uuids)
yaqubdiyaa Mar 4, 2024
e90658c
fixed uuid: format, tested project w repeat groups and no uuids (repe…
yaqubdiyaa Mar 4, 2024
3f7019f
media upload and uuid: format working
yaqubdiyaa Mar 4, 2024
d8923da
added readme comments about export formatting (single column, no medi…
yaqubdiyaa Mar 4, 2024
992c6fe
more code blocks
yaqubdiyaa Mar 5, 2024
5e21a8e
all general tests working! all new.py copied onto xlsx_kobo; only che…
yaqubdiyaa Mar 5, 2024
cda17c9
new.py and xlsx_kobo are consistent, added some comments to function …
yaqubdiyaa Mar 5, 2024
3b889d4
deleted unnecessary comments from xlsx_kobo new_repeat(), but kept al…
yaqubdiyaa Mar 5, 2024
7243fd8
changed deprecated+submissionID logic and it worked!!
yaqubdiyaa Mar 8, 2024
1aba66e
changed nsmap_element naming to nsmap_dict
yaqubdiyaa Mar 8, 2024
d1a2d41
fixed all_empty warning message
yaqubdiyaa Mar 8, 2024
d5822ef
changed create_group so that check_for_group isn't needed anymore. no…
yaqubdiyaa Mar 8, 2024
4cc7087
removed the warnings method (clunky, unnecessary, need to ask josh if…
yaqubdiyaa Mar 8, 2024
124e3bb
works for everything (haven't tested transfer) except case 3
yaqubdiyaa Mar 10, 2024
a13ecc0
tested repeat() function with different cases, but not sure if I'm mi…
yaqubdiyaa Mar 10, 2024
d3ff827
unsure about repeat method
yaqubdiyaa Mar 10, 2024
b31cd1a
cleaned methods
yaqubdiyaa Mar 10, 2024
0c0cfa2
replaced "" --> ' '
yaqubdiyaa Mar 13, 2024
01c0ed4
refactored new repeat method
yaqubdiyaa Mar 13, 2024
e4a4d6e
added comments, changed variable name _uid --> submission_xml
yaqubdiyaa Mar 13, 2024
be9b4b2
reformatted
yaqubdiyaa Mar 13, 2024
7798d16
reordered functions
yaqubdiyaa Mar 13, 2024
8acbbd1
Merge pull request #14 from yaqubdiyaa/review
yaqubdiyaa Mar 13, 2024
034c7b3
Rename README.md to xls-transfer/README.md
yaqubdiyaa Mar 13, 2024
8142513
Delete testrun.py
yaqubdiyaa Mar 13, 2024
ae80d2d
Delete GoogleDemoForm (Responses).xlsx
yaqubdiyaa Mar 13, 2024
009c4db
Delete nested_project.xlsx
yaqubdiyaa Mar 13, 2024
296ca92
Delete DemoProjectInitialUpload.xlsx
yaqubdiyaa Mar 13, 2024
f92a7be
Delete normal_group.xlsx
yaqubdiyaa Mar 13, 2024
7153545
Delete transfer/new.py
yaqubdiyaa Mar 13, 2024
40195e2
Create xls-import-requirements.txt
yaqubdiyaa Mar 13, 2024
96027c2
added req files + edited readme
yaqubdiyaa Mar 13, 2024
4d0a8a1
deleted imports
yaqubdiyaa Mar 13, 2024
c7449a0
edited readme
yaqubdiyaa Mar 13, 2024
cbdb81b
readme edit
yaqubdiyaa Mar 17, 2024
f08792f
added tests
yaqubdiyaa Mar 17, 2024
543f092
new tests (group geo)
yaqubdiyaa Mar 17, 2024
b106f5d
group geo edit ++ parent_table outside loop edit
yaqubdiyaa Mar 17, 2024
1e407b4
Update test_description.txt
yaqubdiyaa Mar 17, 2024
20c172b
initial upload example + test
yaqubdiyaa Mar 19, 2024
4502ebb
Merge branch 'main' of https://github.com/yaqubdiyaa/kobo-transfer
yaqubdiyaa Mar 19, 2024
Binary file added DemoProjectInitialUpload.xlsx
Binary file not shown.
Binary file added GoogleDemoForm (Responses).xlsx
Binary file not shown.
158 changes: 90 additions & 68 deletions README.md
@@ -1,87 +1,109 @@
# kobo-transfer

Transfer submissions between two identical projects.
Transfer submissions from XLSX Form to Kobo Project

## Setup

1. Ensure the destination project is deployed and has the same content as the
source project.
Demo: https://drive.google.com/file/d/1yMcsEKqOH3L09O00urFko3iB77PABuFh/view?usp=sharing

1. Clone a copy of this repo somewhere on your local machine:
## Usage

For XLSX:
```bash
git clone https://github.com/kobotoolbox/kobo-transfer
python3 run.py -xt -ef [excel_file_path]
```

1. Copy `sample-config.json` to `config.json` and add your configuration details
for the source (`src`) and destination (`dest`) projects. If both projects
are located on the same Kobo instance, then just duplicate the URL and token
values.

## Usage

For data downloaded from Google Form:
```bash
python3 run.py \
[--config-file/-c] [--limit/-l] [--last-failed/-lf] \
[--keep-media/-k] [--regenerate-uuids/-R] [--no-validate/-N] [--quiet/-q]
python3 run.py -gt -ef [excel_file_path]
```

The original UUID for each submission is maintained across the transfer,
allowing for duplicate submissions to be rejected at the destination project if
the script is run multiple times. If this behaviour is not desired, pass the
`--regenerate-uuids` flag to create new UUIDs for each submission. This may be
necessary when transferring submissions to a project located on the same server.

If submissions contain media attachments, all media will be downloaded to a
local `attachments/` directory before the transfer between projects begin.
Attachment files will be cleaned up after completion of the transfer unless the
`--keep-media` flag is passed.

The `--limit` option can be set to restrict the number of submissions processed
in a batch. For large projects, either in number of submissions or number of
questions or both, it may be necessary to reduce the limit below the default of
30000 to mitigate time-outs from the server.
## Requirements

Sometimes transfers will fail for whatever reason. A list of failed UUIDs is
stored in `.log/failures.txt` after each run. You can run the transfer again
with only these failed submissions by passing the flag `--last-failed`.

If you would like to have a configuration file other than `config.json`, such as
when different configurations are kept in the directory, then specify the file
path with `--config-file`:
Make sure you have the following Python packages installed:

```bash
python3 run.py --config-file config-2.json
pip install openpyxl pandas requests xmltodict python-dateutil
```

By default, the configuration file will be validated before the transfer is
attempted. Pass the `--no-validate` flag to skip this step.

## Media attachments
## Setup

Media attachments are written to the local `attachments/` directory and follow
the tree structure of:

```bash
{asset_uid}
├── {submission_uid}
│   ├── {filename}
│   └── {filename}
├── {submission_uid}
│   └── {filename}
├── {submission_uid}
│   └── {filename}
├── {submission_uid}
│   └── {filename}
└── {submission_uid}
   ├── {filename}
   └── {filename}
```
1. The destination project must be deployed and have the same content as the xlsx form. All questions should be in the same order.

2. Clone a copy of this repo somewhere on your local machine

3. Copy `sample-config.json` to `config.json` and add your configuration details
for the source (`src`) and destination (`dest`) projects. If transferring from xlsx to Kobo, duplicate the URL and token values for `src` and `dest`.

### Notes for General XLSX Data Transfer
- For an initial transfer from xlsx to Kobo, when there is no uuid, the column header for each repeat group question must be repeat/{group name}/{xml header for question in repeating group}. Each question in a repeat group must have its own column in the xlsx.
- To associate media with a specific response/submission during an initial transfer, when there is no _uuid column, use the row number of the submission. Save the media for that submission under ./attachments/{asset_uid}/{row_number}.
- For logical groups, the column header can be {group name}/{xml header for question in the group}. The group name in the header must match the Data Column Name in the Kobo project.
- For ranking data, headers must be in this format: {xml header for question}/_1st_choice, {xml header for question}/_2nd_choice, {xml header for question}/_3rd_choice, and so on.

- If data is downloaded from Kobo as xlsx with XML values and headers, repeat groups span multiple tabs. The script supports this: data for these repeat group responses can be edited in the respective tabs and reuploaded.
- To minimise formatting errors when data needs to be cleaned, it's best to do the initial transfer from xlsx, then download the Kobo data as xlsx with XML values and headers. The downloaded xlsx is the best file to work from, since all headers will be in the format the script expects.
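
The header formats described in these notes can be sketched in a few lines of Python. This is only an illustration of the naming convention, not code from this PR: the function names are hypothetical, and the ordinal suffixes beyond `_3rd_choice` (for example `_4th_choice`) are an assumption based on the "and so on" above.

```python
def repeat_header(group_name, question):
    """Header for a question inside a repeat group (initial transfer, no uuid)."""
    return f"repeat/{group_name}/{question}"


def group_header(group_name, question):
    """Header for a question inside a logical (non-repeating) group."""
    return f"{group_name}/{question}"


def ordinal(n):
    """1 -> '1st', 2 -> '2nd', 3 -> '3rd', 4 -> '4th' (assumed continuation)."""
    if 10 <= n % 100 <= 20:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")
    return f"{n}{suffix}"


def ranking_headers(question, n_choices):
    """Headers for a ranking question with n_choices ranked options."""
    return [f"{question}/_{ordinal(i)}_choice" for i in range(1, n_choices + 1)]


if __name__ == "__main__":
    print(repeat_header("household", "age"))
    print(group_header("contact", "phone"))
    print(ranking_headers("fav_food", 3))
```

The group and question names in the usage lines are made up; in practice they would be the XML names from the Kobo project.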

### Notes for Google Form Data Transfer

To ensure that the destination project has the same content as the Google Form, use the corresponding question types for Google Forms and Kobo listed below:

|Google Question Type | Kobo Question Type |
| -------- | -------- |
| Multiple Choice | Select one |
| Short Answer, Paragraph |Text|
| Checkboxes | Select Many|
| Linear Scale | Range |
| Dropdown | Select One|
| Date | Date |
| Time | Time |

To download Google Form responses as xlsx:
Open the Google Form data (Responses tab) in Google Sheets, and download as xlsx from there (file → download as → microsoft excel (xlsx))

- As long as the Google question labels match the Kobo question labels, they don't need to be edited in the downloaded xlsx before running the transfer. The timestamp is saved automatically in Google Form responses, and neither the time format nor the column label needs to be edited. The transfer tool records it as the 'end' time when saving to the Kobo project.
- The selected items of a multiple select response in Google Forms are saved in a single cell, separated by commas, when downloaded. If only one option is selected, add ',' at the end of the response.
- Time and Date question type responses also don't need to be edited. Running the transfer converts them to the correct format for Kobo.
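
As a rough illustration of the multiple select note above, the sketch below shows how a comma-separated cell could be split into options, and how a single selection could be given the trailing comma the README asks for. The helper names are hypothetical and this is not the parsing code used by the transfer script.

```python
def split_multi_select(cell):
    """Split a comma-separated Google Forms cell into a list of options."""
    return [item.strip() for item in cell.split(",") if item.strip()]


def mark_single_selection(cell):
    """Append ',' when the cell holds exactly one option with no comma yet."""
    cell = cell.strip()
    if cell and "," not in cell:
        return cell + ","
    return cell


if __name__ == "__main__":
    print(split_multi_select("Red, Green, Blue"))
    print(mark_single_selection("Red"))
```

Because commas act as the separator, an option that itself contains a comma cannot be told apart from two options, which is consistent with the -gt limitation on ',' listed further down.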

## Edge Cases
- order of questions in google form, and kobo form can be different
- no effect if some questions in kobo form are not present in google form (the response cells for that column will just be empty)
- responses that are left blank in google form results show up correctly (also blank) in kobo
- if the question strings in kobo form and google form are not exact match, transfer will add columns in kobo data for the "extra" questions in google form

### -w (when warning flag -w is passed)
- prints a warning if question strings/labels in kobo form, and xls seem similar (differences in capitalisation, spacing, and punctuation), but not the same.
- prints warning if number of questions in kobo form and xlsx form do not match
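
One possible way to implement the two checks above is sketched below: labels are compared after removing capitalisation, spacing, and punctuation, and the question counts are compared directly. This is an assumption about how such a check could work, with hypothetical function names; it is not the `-w` implementation in this PR.

```python
import string


def normalise(label):
    """Lowercase the label and drop punctuation and all whitespace."""
    no_punct = label.translate(str.maketrans("", "", string.punctuation))
    return "".join(no_punct.lower().split())


def label_warnings(kobo_labels, xlsx_labels):
    """Return warning strings for near-miss labels and mismatched counts."""
    warnings = []
    if len(kobo_labels) != len(xlsx_labels):
        warnings.append(
            f"Question count differs: Kobo has {len(kobo_labels)}, "
            f"xlsx has {len(xlsx_labels)}."
        )
    kobo_by_norm = {normalise(label): label for label in kobo_labels}
    for label in xlsx_labels:
        match = kobo_by_norm.get(normalise(label))
        # Warn only when the labels are similar but not an exact match.
        if match is not None and match != label:
            warnings.append(
                f"'{label}' is similar to Kobo label '{match}' but not identical."
            )
    return warnings


if __name__ == "__main__":
    for message in label_warnings(["What is your name?"], ["what is your name"]):
        print(message)
```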

## Limitations

- Although submissions will not be duplicated across multiple runs of the
script, if the submissions contain attachment files, the files are duplicated
on the server.
- The script does not check if the source and destination projects are identical
and will transfer submission data regardless.
- If transferring from google form xlsx data (running -gt), submissions will be duplicated each time the script is run. Even when google form xlsx data is uploaded, edited, and then reuploaded, it will show up as a new submission instead of editing the one in kobo. To avoid this, after transferring from google form xlsx into kobo once, download the kobo data in xlsx form and edit/reupload that one with the flag -xt.
- Similarly, if running -xt for initial xlsx data without uuid, submissions will be duplicated each time script is run. To avoid, after initial transfer, download data from kobo as xlsx and edit/work with that.

<br>

- The script assumes that question types and labels in the Kobo project and the xlsx form match. It does not throw an error if they differ, but the transferred submissions will be recorded incorrectly.
- Labels in the xlsx cannot contain '/' unless the column belongs to a repeating or logical group.
- Any data transferred will be accepted by Kobo. For example, if the question type is number in Kobo but the transferred submission is text/string, it will be saved as such. For question types such as dropdown/select one, responses in the xlsx can be any string and Kobo will save it, regardless of whether it is an option in the Kobo project.
- _submitted_by in Kobo will show username of account running the transfer, for all submissions.
- submission_time in Kobo will show the time transfer was completed. 'end' shows time of response submission.
- data could be recorded in Kobo as 'invalid' but code will not throw error in this case. For example, if date or time format is incorrect when uploading to a Kobo Date or Time question, it will save as "Invalid".
- If ‘None’ is a response in submission, it will show up as blank after being transferred to kobo
- Although submissions will not be duplicated across multiple runs of the script, if the submissions contain attachment files, the files are duplicated on the server.

<br>

- If running -gt, responses and text submissions cannot contain ',' or '/', since the data would be transferred to Kobo incorrectly.
- If running -gt, text submissions will be changed: all commas will show up as a space character, and all text will be lowercase.
- Does not support the Google question types multiple choice grid, tick box grid, and file attachments.
- Google Sheets does not have a ‘start’ and ‘end’ like Kobo does; it only records submission time. Submission time data will show up in the ‘end’ column in the Kobo project when running -gt.
- For time question types in Kobo, the time zone is recorded. Time question types in Google Sheets do not have the same feature, so the time will not show UTC + ___.

## Media Upload
Demo walks through this process:
- For an initial data upload, create a folder named attachments containing a subfolder named after the asset uid of the form in Kobo. To associate media with a specific submission, create subfolders named after the row number of the submission in the xlsx. For example, when there is no uuid yet (data imported from a different source), the file path for the media would be attachments/aMhhwTacmk9PLEQuv9etDS/2. Media file names within that folder must match the filename in the xlsx cell exactly.
- If the data already has uuids, create the attachments folder with an asset uid subfolder. Within it, create one folder per response, named after its uuid, and place the media associated with that uuid inside.
- When run.py is run with the attachments folder in place, media should save in Kobo correctly.
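
The folder layout described above can be checked before a transfer with a small script. The sketch below is a hypothetical helper, not part of this PR: it builds the expected folder for a submission (keyed by row number or uuid) and lists any filenames from the xlsx that are not present there.

```python
from pathlib import Path


def submission_media_dir(asset_uid, submission_key, root="attachments"):
    """Folder expected to hold media for one submission.

    submission_key is the row number for an initial upload, or the uuid
    when the data already has uuids.
    """
    return Path(root) / asset_uid / str(submission_key)


def missing_media(asset_uid, submission_key, filenames, root="attachments"):
    """Return the filenames listed in the xlsx that are not in the folder."""
    folder = submission_media_dir(asset_uid, submission_key, root)
    return [name for name in filenames if not (folder / name).is_file()]


if __name__ == "__main__":
    # Asset uid taken from the README example; the filename is made up.
    print(submission_media_dir("aMhhwTacmk9PLEQuv9etDS", 2).as_posix())
    print(missing_media("aMhhwTacmk9PLEQuv9etDS", 2, ["photo.jpg"]))
```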

## Notes regarding media uploaded as a response in google forms
Google form attachments are saved in a folder in Google Drive and folders are categorised by questions. If the google drive folder path is specified, it's possible to have all the images transferred to Kobo and have them show up in Gallery View.
For the filter view of these images, where a question is selected in Kobo, and images submitted for that question appears, user would need to manually specify which google drive folder path corresponds to each question. It's also not possible to link an image to a specific submission.

Right now, I've only figured out how to transfer images, and I'm not sure if other types are possible to transfer. Given all these limitations because of the access rules in Google Drive, is it worth implementing? The Google Drive image transfer to the Kobo gallery works but is not included in the main branch, since there are a few bugs and I'm not sure it makes sense to have if each drive link needs to be listed with each question.
65 changes: 58 additions & 7 deletions run.py
@@ -12,9 +12,14 @@
    print_stats,
    transfer_submissions,
)
from transfer.xlsx_kobo import general_xls_to_xml


def main(
    warnings,
    gtransfer,
    xtransfer,
    excel_file,
    limit,
    last_failed=False,
    keep_media=False,
@@ -23,24 +28,31 @@ def main(
    validate=True,
    config_file=None,
):

    config = Config(config_file=config_file, validate=validate)
    config_src = config.src

    print('📸 Getting all submission media', end=' ', flush=True)
    get_media()
    if not gtransfer and not xtransfer:
        print('📸 Getting all submission media', end=' ', flush=True)
        get_media()

    xml_url_src = config_src['xml_url'] + f'?limit={limit}'

    if last_failed and config.last_failed_uuids:
        xml_url_src += f'&query={json.dumps(config.data_query)}'

    all_results = []

    submission_edit_data = get_submission_edit_data()

    print('📨 Transferring submission data')

    def transfer(all_results, url=None):
        parsed_xml = get_src_submissions_xml(xml_url=url)
        if (xtransfer or gtransfer):
            parsed_xml = general_xls_to_xml(excel_file, submission_edit_data, gtransfer, warnings)
        else:
            parsed_xml = get_src_submissions_xml(xml_url=url)

        submissions = parsed_xml.findall(f'results/{config_src["asset_uid"]}')
        next_ = parsed_xml.find('next').text
        results = transfer_submissions(
@@ -52,11 +64,12 @@ def transfer(all_results, url=None):
        all_results += results
        if next_ != 'None' and next_ is not None:
            transfer(all_results, next_)

    transfer(all_results, xml_url_src)

    if not keep_media:
        del_media()

    if not xtransfer and not gtransfer:
        if not keep_media:
            del_media()

    print('✨ Done')
    print_stats(all_results)
@@ -66,6 +79,34 @@ def transfer(all_results, url=None):
    parser = argparse.ArgumentParser(
        description='A CLI tool to transfer submissions between projects with identical XLSForms.'
    )

    parser.add_argument(
        '--print-warnings',
        '-w',
        default = False,
        action = 'store_true',
        help='Print warnings if questions in Kobo form do not match XLS form.',
    )

    parser.add_argument(
        '--google-transfer',
        '-gt',
        default = False,
        action = 'store_true',
        help='Complete transfer from Google Form data to Kobo project.',
    )
    parser.add_argument(
        '--excel-transfer',
        '-xt',
        default = False,
        action = 'store_true',
        help='Complete transfer from any xlsx form to Kobo project.',
    )
    parser.add_argument(
        '--excel-file',
        '-ef',
        help='Excel file path for data to upload',
    )
    parser.add_argument(
        '--limit',
        '-l',
@@ -117,8 +158,15 @@ def transfer(all_results, url=None):
    )
    args = parser.parse_args()

    if (args.excel_transfer and not args.excel_file) or (args.google_transfer and not args.excel_file):
        parser.error("If --excel-transfer (-xt) or --google-transfer (-gt) is passed, --excel-file (-ef) is required.")

    try:
        main(
            warnings = args.print_warnings,
            gtransfer= args.google_transfer,
            xtransfer = args.excel_transfer,
            excel_file=args.excel_file,
            limit=args.limit,
            last_failed=args.last_failed,
            regenerate=args.regenerate_uuids,
@@ -127,7 +175,10 @@ def transfer(all_results, url=None):
            validate=not args.no_validate,
            config_file=args.config_file,
        )

    except KeyboardInterrupt:
        print('🛑 Stopping run')
        # Do something here so we can pick up again where this leaves off
        sys.exit()


Binary file added transfer/.DS_Store
Binary file not shown.
30 changes: 23 additions & 7 deletions transfer/media.py
@@ -1,24 +1,39 @@
import argparse
import json
import os
import pathlib
import re
import requests
import shutil
import sys
import time

from helpers.config import Config


import os
import shutil



def rename_media_folder(submission_data, uuid, rowNum):
    current_attachments_path = os.path.join(
        Config.ATTACHMENTS_DIR, submission_data['asset_uid'], str(rowNum)
    )
    if (os.path.exists(current_attachments_path)):
        new_attachments_path = os.path.join(
            Config.ATTACHMENTS_DIR, submission_data['asset_uid'], str(uuid)
        )
        try:
            # Move the folder to the new path
            shutil.move(current_attachments_path, new_attachments_path)

        except Exception as e: #TODO
            print(f"Error: {e}")

Review comment (Member) on `except Exception as e: #TODO`: TODO?

def del_media():
    config = Config().src
    media_path = os.path.join(Config.ATTACHMENTS_DIR, config['asset_uid'])
    if os.path.exists(media_path):
        print('🧹 Cleaning up media (pass `--keep-media` to prevent cleanup).')
        shutil.rmtree(media_path)


def get_media(verbosity=0, chunk_size=1024, throttle=0.1, limit=1000, query=''):
    config = Config().src
    config.update(
@@ -29,9 +44,10 @@ def get_media(verbosity=0, chunk_size=1024, throttle=0.1, limit=1000, query=''):
            'throttle': throttle,
        }
    )

    stats = download_all_media(
        data_url=config['data_url'],
        stats=get_clean_stats(),
        data_url=config['data_url'],
        stats=get_clean_stats(),
    )

