Drew DiPalma
Oct 18, 2023

Introducing Data Explorer

We are delighted to announce that Seqera’s Data Explorer is now available in Public Preview in Seqera Platform! Data Explorer lets you easily visualize, search, and manage data across different cloud providers. It makes it simpler to link data to pipelines, troubleshoot runs, and examine outputs, all without switching context.

The need for simpler data management

Today, Nextflow provides powerful integrations with cloud object stores, including Amazon S3, Azure Blob Storage, and Google Cloud Storage. Users can simply provide storage credentials for their preferred cloud in a nextflow.config file and add paths to cloud-resident datasets in pipelines. Nextflow looks after the rest, transparently managing the process of copying data to and from cloud object stores during pipeline execution.
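As a rough illustration of the setup described above, a nextflow.config for Amazon S3 might look like the sketch below. The bucket name, paths, region, and the `input` and `outdir` parameter names are hypothetical placeholders rather than values from any real pipeline, and the credentials shown are dummies. In practice, Nextflow can also pick up AWS credentials from the standard environment variables instead of the config file.

```groovy
// nextflow.config: a minimal sketch with placeholder values only

// Credentials and region for Amazon S3 (dummy values; never commit real keys)
aws {
    accessKey = '<YOUR_ACCESS_KEY>'
    secretKey = '<YOUR_SECRET_KEY>'
    region    = 'eu-west-1'
}

// Hypothetical pipeline parameters pointing at cloud-resident data
params {
    input  = 's3://my-example-bucket/raw/*_{1,2}.fastq.gz'
    outdir = 's3://my-example-bucket/results'
}
```

With a configuration along these lines, a pipeline that reads `params.input` and publishes to `params.outdir` works with S3 objects as if they were ordinary file paths, and Nextflow handles the transfers.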

Even so, this still leaves a lot of manual data wrangling alongside pipeline development and deployment – including staging data, managing files, and retrieving results from pipeline runs.

While cloud-specific CLIs, web-based interfaces such as the AWS S3 Console or Azure Storage Explorer, and freeware clients such as S3 Browser can all help, the need for multiple tools makes data management complex and reduces efficiency, particularly when working across clouds, regions, or accounts.

As we talked with users across the community, we saw an opportunity to streamline how data moves through scientific pipelines, from the moment it lands in cloud storage through iterative development, troubleshooting, and analysis of results. We started simplifying this process with Datasets, a convenient metadata layer for organizing versioned, structured data. However, we wanted to do more to enable users to manage their data and analyses in one simple workflow.

Introducing Data Explorer

Data Explorer simplifies data management across multiple cloud object stores. With Data Explorer, you can:

  • Browse, search, and preview data in cloud object stores, and upload data prior to pipeline submission.
  • Easily link data to pipelines.
  • Quickly view pipeline outputs or dive into task and working directory data.

[Image: Seqera Data Explorer]

With Data Explorer, users can browse cloud buckets across providers, or search within buckets by name, cloud provider, region, or other attributes. Data Explorer will even automatically index buckets using the workspace credentials.

Data Explorer makes managing storage credentials a breeze. Once a bucket is configured, users can add, remove, or hide buckets from the list to control what data is visible. Maintainer users can also upload, preview, and download files. This not only makes users more productive, but also keeps data secure by reducing the number of people who need access to cloud storage credentials.

Simpler Data Integration for Researchers

With the ability to manage data in cloud storage through the Seqera UI, researchers can:

  • Easily access and view datasets from multiple clouds.
  • Reduce complexity by limiting the need to interact with third-party CLIs or tools for data management.
  • Improve productivity by enabling shared access to public and private data from within the Seqera Platform.

Data Explorer functionality is available in Seqera Cloud effective today and will be available in Seqera Enterprise in release 23.3.

To learn more and obtain a free Seqera Cloud account, visit seqera.io.