How to Version, Debug, Compare and Share Jupyter Notebooks (2024)

ML model development has improved by leaps and bounds, and Jupyter Notebooks have been a big factor in this change. Thanks to their interactive development workflow, support for Markdown and LaTeX, and a huge repository of plugins, they have become a go-to tool for data scientists and ML practitioners.

The popularity of Notebooks in this space has led to many offerings in ML experiment tracking. Notebooks don’t come with a native tracking feature (as of writing this article), so one has to look for solutions elsewhere. In this blog, we’re going to touch on:

  • Why is it important to version Notebooks?
  • Different ways to version, debug, and compare experiments done in Notebooks.
  • How can neptune.ai help track, debug, and compare Jupyter Notebooks?

Why should you version Notebooks?

Building ML models is experimental in nature, and it’s common to run numerous experiments in search of the combination of algorithm, parameters, and data preprocessing steps that yields the best model for the task at hand. This requires some form of organization once the complexity of the problem grows.

While running experiments in Notebooks, you’ll feel the need for versioning and tracking the same way you would if you were building your ML models in another IDE. Here are some key points on why you should adopt the best practice of setting up some form of versioning for your ML experiments:

  1. Collaboration: Working in a team requires a collaborative effort in decision making, which becomes cumbersome without centrally logged experiment details like model metadata, metrics, etc.
  2. Reproducibility: It saves a lot of time for retraining and testing if you are logging the model configurations somewhere. By taking snapshots of the entire Machine Learning pipeline, it becomes possible to reproduce the same output again.
  3. Dependency tracking: By using version control, you can track different versions of the datasets (training, validation, and test), test more than one model on different branches or repositories, tune the model parameters and hyperparameters, and monitor the accuracy of each change.
  4. Model updates: Model development is not done in one step; it works in cycles. With the help of version control, you can control which version is released while continuing the development for the next release.

How to version Jupyter Notebooks?

There are many ways to version experiments you run in notebooks, ranging from simple log files to full-scale experiment tracking tools that offer a lot of features. Let’s talk about some from each category and understand what would be the right choice, given your requirements.

1. Tracking Notebooks in spreadsheets

Tracking ML experiments in Excel or Google spreadsheets is a fast yet brute-force solution. Spreadsheets provide a comfortable, easy-to-use experience: you can paste your metadata directly and create multiple sheets for multiple runs. But this approach comes with lots of caveats. Let’s see where it shines and where it doesn’t:

Pros

  1. Easy to use with a familiar interface.
  2. Reports for stakeholders can be directly created within the tool.
  3. It can be a boon for non-technical folks on the team to contribute.

Cons

  1. Tracking experiments in spreadsheets is a tedious affair: you either need to copy and paste model metadata and metrics into the spreadsheet or use a module like pandas to log information and later save it to a spreadsheet.
  2. Once the number of experiments increases, it will get unmanageable to log each run in a separate sheet.
  3. Tracking and managing countless variables and artifacts in a simple spreadsheet is not the best way to approach the problem.
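
To make the "log from code, open in a spreadsheet later" workflow concrete, here is a minimal sketch. It uses Python's standard-library csv module instead of pandas to stay dependency-free, and the column names and the log_run / read_runs helpers are hypothetical choices made for illustration, not part of any tool mentioned in this article.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical column layout; adapt it to the metadata you care about.
FIELDS = ["timestamp", "run_name", "algorithm", "learning_rate", "accuracy"]


def log_run(path, run_name, algorithm, learning_rate, accuracy):
    """Append one experiment run as a row to a CSV file, writing the header once."""
    path = Path(path)
    is_new = not path.exists()
    with path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow({
            "timestamp": datetime.now(timezone.utc).isoformat(timespec="seconds"),
            "run_name": run_name,
            "algorithm": algorithm,
            "learning_rate": learning_rate,
            "accuracy": accuracy,
        })


def read_runs(path):
    """Load all logged runs back as a list of dictionaries (values are strings)."""
    with Path(path).open(newline="") as f:
        return list(csv.DictReader(f))
```

Calling log_run("experiments.csv", "run-1", "random_forest", 0.1, 0.91) at the end of a training cell adds one row per run, and the resulting file opens directly in Excel or Google Sheets. The cons above still apply: every new kind of metadata means another column to maintain by hand.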

2. Versioning Notebooks using Git

Git can be a versatile tool for your project. Not only can it track changes in your notebook, it can also serve as the version control tool for your entire project. With it, you can push model-related metadata such as trained weights and evaluation reports (a confusion matrix, for example) to a central repository that your Data Science team can use to make informed decisions. Let’s look at some pros and cons of using Git for experiment tracking:

Pros

  1. A single version control system for all code and notebook files.
  2. A popular tool in the tech community.
  3. It gives access to millions of other repositories which can be used as a starting point.

Cons

  1. Hard to onboard non-programmers and other stakeholders.
  2. An unintuitive interface that may create friction for collaborative work.
  3. Requires technical expertise to set up and maintain experiment-related repositories.
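
As a rough illustration of pushing evaluation reports to a repository, the sketch below writes run metadata to a JSON file that can be committed next to the notebook. The write_report helper, the file layout, and the field names are assumptions made for this example; the only point is that a stable, sorted serialization keeps Git diffs small.

```python
import json
from pathlib import Path


def write_report(path, params, metrics, confusion_matrix):
    """Write run metadata as stable, diff-friendly JSON and return the file path."""
    report = {
        "params": params,
        "metrics": metrics,
        "confusion_matrix": confusion_matrix,
    }
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # sort_keys and a fixed indent keep the file byte-identical for identical
    # inputs, so Git only shows a diff when a value really changed.
    path.write_text(json.dumps(report, indent=2, sort_keys=True) + "\n")
    return path
```

After a training cell calls something like write_report("reports/run.json", params, metrics, cm), the file can be versioned with the usual `git add reports/run.json` and `git commit`, and `git diff` then shows exactly which metric or parameter changed between two commits.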

3. Versioning Notebooks with experiment tracking tools

Experiment tracking tools are tailor-made for this use case. They cover almost all of the requirements you might have for a metadata management tool, from experiment tracking to model registry. Many tools have appeared in this space over the last few years, with prominent players being neptune.ai, Weights & Biases, and MLflow. Let’s look at some advantages and disadvantages of these tools:

Pros

  1. Covers all the functionalities you need while organizing your experiment runs.
  2. All of these tools come with a dedicated interactive UI that can be used for comparisons, debugging, or report generation.
  3. Each tool offers a plethora of features for team collaboration.

Cons

  1. Unlike Git or spreadsheets, experiment tracking tools usually come with a fee. Almost all of them have a free tier for a single user, but it comes with limitations. On the other hand, paying for the tool means you don’t have to worry about setup, maintenance, or developing features yourself.

There may be numerous makeshift solutions that address your specific experiment tracking problem, and a lot of legacy tools can cover a few areas of tracking and organizing your ML experiments. But if you want a full-fledged solution for your organization’s needs, you should ideally go for an experiment tracking tool.

Tracking, debugging, and comparing Jupyter Notebooks in Neptune

neptune.ai is an ML metadata store that was built for research and production teams that run many experiments. It has a flexible metadata structure that allows you to organize training and production metadata the way you want to.

It gives you a central place to log, store, display, organize, compare, and query all metadata generated during the machine learning lifecycle. Individuals and organizations use Neptune for experiment tracking and model registry to keep their experimentation and model development under control.

The web app was built for managing ML model metadata and it lets you:

  • filter experiments and models with an advanced query language.
  • customize which metadata you see with flexible table views and dashboards.
  • monitor, visualize, and compare experiments and models.

Neptune-Jupyter extension

Neptune offers seamless integration with Jupyter Notebooks through the Neptune-Jupyter extension. You can use experiment tracking directly from your notebook without having to juggle many tools. Head over to the Neptune-Jupyter Notebooks docs to get started.

With the Neptune-Jupyter integration, you can:

  • Log and display notebook checkpoints either manually or automatically during model training.
  • Connect notebook checkpoints with model training runs in Neptune.
  • Organize checkpoints with names and descriptions.
  • Browse checkpoint history across all Notebooks in the project.
  • Compare notebooks side-by-side, with diffs for source, markdown, output, and execution count cells.
  • Share notebook checkpoints or diffs with persistent links.
  • Download notebook checkpoints directly from Neptune or Jupyter.
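
To give a feel for what a source diff between two notebook checkpoints involves, here is a small standard-library sketch. It is not how Neptune implements its comparison view; it only relies on the fact that an .ipynb file is JSON with a list of cells, and it naively pairs cells by position. The cell_sources and diff_notebooks names are hypothetical.

```python
import difflib
import json


def cell_sources(notebook_json):
    """Return the source text of every cell in an .ipynb document."""
    nb = json.loads(notebook_json)
    sources = []
    for cell in nb.get("cells", []):
        src = cell.get("source", "")
        # The notebook format allows source to be a string or a list of lines.
        sources.append("".join(src) if isinstance(src, list) else src)
    return sources


def diff_notebooks(old_json, new_json):
    """Return unified-diff lines comparing cell sources, paired by cell index."""
    lines = []
    old, new = cell_sources(old_json), cell_sources(new_json)
    for i in range(max(len(old), len(new))):
        a = old[i].splitlines() if i < len(old) else []
        b = new[i].splitlines() if i < len(new) else []
        # unified_diff yields nothing when the two cells are identical.
        lines.extend(
            difflib.unified_diff(a, b, f"cell {i} (old)", f"cell {i} (new)", lineterm="")
        )
    return lines
```

Passing the text of two saved versions of the same notebook to diff_notebooks returns lines prefixed with - and + for changed source. Pairing by index breaks down as soon as a cell is inserted or deleted in the middle, which is one reason a dedicated comparison view is more convenient than a hand-rolled diff.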

Here’s an open example in the Neptune app with a few notebooks logged.

Why should you use Neptune with Jupyter Notebook?

The aforementioned features make Neptune a great choice for tracking and versioning experiments with the Jupyter Notebook. Here’s what makes it a top contender for the role apart from the technical features we discussed in the last section:

  1. Seamless integration: With the Neptune-Jupyter extension, you can connect your Notebook to the Neptune dashboard and get versioning and sharing capabilities. This reduces friction compared to other methods.
  2. An abundance of features: Features offered by Neptune give you the freedom to monitor/log/store/compare whatever you want to make your experiment successful.
  3. Availability of free tier: A free tier is available for single users and offers important features at no cost.
  4. User and customer support: Thanks to the quick and helpful support team, you can get your problems fixed at a faster pace and only focus on building models.

You’ve reached the end!

Congratulations! You now know what to look for when choosing a method to organize your Notebook experiments. In this article, we explored straightforward ad-hoc methods like spreadsheets and Git, as well as more nuanced approaches like experiment tracking tools. Here are some bonus tips to help you choose your next tool:

  1. Stick to what you need! It’s easy to get lost in the sea of tools and methods, but sticking to your requirements will help you make better decisions.
  2. I’d recommend using the “Try for Free” feature in every tool before you lock in on any single solution.

Thanks for reading! Stay tuned for more! Adios!





    Author: Stevie Stamm
