Code for data collection, data cleaning and data analysis
The pages linked below contain brief descriptions of scripts for data collection, data cleaning or data analysis in the DigiKAR project. Some of the Jupyter Notebook files (ending in .ipynb) were initially created for Google Colab and need to be adjusted when used in other environments. In the DigiKAR project, Google Colab was used because we did not have access to an institutional research software infrastructure. Ideally, code should be hosted in non-commercial environments, such as university-hosted computing infrastructures for data science.
To make the Colab notebooks work for you, please carry out the following steps:
- Put the notebook on your own Google Drive, ideally in a folder whose name contains "Colab" so that you can easily identify it later.
- Open the notebook and adjust the directory path according to your own file location. You may also change the paths of the input and output data in the script, depending on your own preferred folder structure. Make sure that all folders you name in the script also exist on Google Drive before you execute the script.
- Select "open with" and connect to the Google Colab app. If you have not used Google Colab before, select the "connect more apps" option and find Colab there.
- Make sure to give Colab all the permissions it needs to run the script and to read and write files. If you do not want Colab to access a private Google Drive, you may want to create a new Google account exclusively for research purposes.
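As a rough illustration of the path adjustment described in the steps above, the following sketch shows one way a notebook's first cell could mount Google Drive when it runs in Colab and fall back to a local folder elsewhere. It is not taken from the DigiKAR notebooks: the function names, the `digikar` folder and the `input` / `output` subfolders are hypothetical placeholders, and checking `sys.modules` for `google.colab` is a common convention rather than an official detection method.

```python
import sys
from pathlib import Path


def in_colab() -> bool:
    """Return True when the code appears to run inside Google Colab."""
    # Assumption: Colab kernels load the google.colab package at startup.
    return "google.colab" in sys.modules


def project_root(local_fallback: str = ".") -> Path:
    """Return the project folder, mounting Google Drive first when in Colab."""
    if in_colab():
        from google.colab import drive  # only importable inside Colab

        # Triggers the permission prompt for Drive access.
        drive.mount("/content/drive")
        # Hypothetical location; replace with your own folder on Drive.
        return Path("/content/drive/MyDrive/Colab Notebooks/digikar")
    # Outside Colab, work relative to a local directory instead.
    return Path(local_fallback)


def prepare_folders(root: Path, names=("input", "output")) -> dict:
    """Create the named subfolders under root if missing; return their paths."""
    paths = {name: root / name for name in names}
    for path in paths.values():
        path.mkdir(parents=True, exist_ok=True)
    return paths
```

In a notebook, calling `folders = prepare_folders(project_root())` and then reading from `folders["input"]` and writing to `folders["output"]` would keep all paths in one place, so that only the folder name inside `project_root` needs to change when the notebook moves to another Drive or another environment.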
INFO
This part of the documentation needs to be updated. The links below do not need to be provided in multiple languages anymore. We can instead link to specific pages in docs for every language.