Working on TSD

– LCBC general TSD tools –

Athanasia M. Mowinckel

Why do we use TSD?

  • The only Norwegian service that provides storage of black (sensitive) data

  • Also connected to a computing cluster for heavy computational loads

  • Service is affordable, compared to other possible cloud computing and storage systems

What are the challenges?

  • Staff are unfamiliar with Linux

  • The general file structure and navigation are unfamiliar

  • The lack of “simple” UI tools makes it hard for junior staff to do certain tasks

  • Access is slow and laggy

  • Keyboard mappings are not always correct/as expected

  • Old operating systems make some tools hard to get working

What do we use TSD for?

Almost everything

  • data storage

  • analysis/computing

  • data validation / quality assurance

  • data logging

  • software development

Getting data into TSD

How does data get into TSD?

  • Nettskjema

    • Telephone interviews

    • Questionnaires

    • Data entered by our staff

    • Logs entered by our staff

    • Computer tasks

    • File uploads

  • S3 bucket

    • Larger files, such as MRI, genetics, etc.

  • Data portal

    • Most day-to-day imports of smaller files

Why do we use Nettskjema so actively?

  • anyone with a smartphone and internet can access forms to input data (i.e. we can use it “on the go”)
  • data entry can be standardised, so that there is less possibility to enter incorrect or impossible data
  • data is sent directly to TSD, so the data is always secure and is not easily lost
  • we don’t have issues with multiple people working on the same local file, locking it for editing by others
  • data can enter TSD by staff/partners without TSD access

Working inside TSD

TSD userland services (TULS)

  • All users like to have UI components to work with

  • Admin team set up tools:

    • service-user: for general tools or for running general-purpose pipelines

    • service-machine: services run on dedicated VM

Screenshot of TULS running in TSD

Service status

Gitea - Git with a cup of tea

  • GitHub/GitLab-like server inside TSD

  • Having a git server inside TSD aids clean collaboration

  • Syncing repos between GitHub/Gitea is cleaner

Gitea

Nettskjema services

Our database

Example of MRI data checking tool

Tools

  • Admins/Managers install various tools in common user folders

  • Set up distinct aliases and path shortcuts for accessing these tools

  • Users are recommended to source aliases into their .bashrc for easy access

    • Can at times collide with system set-ups by USIT or the Colossus module system

    • Needs to be done sparingly
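
The sourcing step above can be sketched as a small guard in .bashrc. This is only an illustration, not the actual LCBC set-up: the function name source_shared_aliases, the variable LCBC_ALIASES, and the path $HOME/shared_tools/aliases.sh are all hypothetical placeholders for whatever the admins provide. The guard skips the file when it is missing, so a moved or renamed alias file does not produce errors in every new shell.

```shell
# Minimal sketch for ~/.bashrc (all names and paths below are hypothetical).

# Source a shared alias file only if it exists and is readable.
# Returns 0 when the file was sourced, 1 when it was skipped.
source_shared_aliases() {
    local alias_file="$1"
    if [ -r "$alias_file" ]; then
        # Load the aliases and path shortcuts into the current shell.
        . "$alias_file"
        return 0
    fi
    return 1
}

# Assumed location of the shared file; override by setting LCBC_ALIASES.
source_shared_aliases "${LCBC_ALIASES:-$HOME/shared_tools/aliases.sh}" || true
```

Keeping the shared definitions in one file and sourcing it from a single line makes it easy to comment that line out when an alias collides with a system set-up or a module.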

What can TSD improve for the end users?

  • Increased communication with users to inform development

  • An online user forum to connect users

  • RStudio Package Manager for R and Python

End