Definition

A namespace is simply a dictionary of all currently defined names and their values: key-value pairs. The interesting part is which symbolic name is available where, and how to organise your files and methods so that namespaces work as intended.

A symbolic name is something that uniquely identifies a specific entry in a program or language. In Python, a name is a reference to a specific object (in CPython, the reference leads to the object's memory location). The object can be an array, a number, a class instance, etc.

An assignment creates a symbolic name that can be used to reference an object:

s = "foo"
print(id(s))

To check an object's unique identifier, call {python} id(x). In CPython this is the object's memory address.

A complex Python program has hundreds or thousands of these names. To avoid interference between them and to keep them manageable, we use namespaces. As mentioned above, a namespace is a dictionary of available key-value pairs.
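A minimal sketch of the "namespace is a dictionary" idea, assuming the code runs at module level (as a script or in an interactive window). The function name show_local_namespace is made up for illustration; globals() and locals() are the built-ins that expose the dictionaries.

```python
# Module-level (global) names live in a real dict returned by globals().
s = "foo"
module_namespace = globals()
name_is_registered = "s" in module_namespace   # the key is the name
same_object = module_namespace["s"] is s       # the value is the object


def show_local_namespace():
    # Hypothetical helper: locals() gives the names bound in this call.
    x = 1
    y = "bar"
    return dict(locals())


local_names = show_local_namespace()
print(name_is_registered, same_object)
print(sorted(local_names))  # ['x', 'y']
```

Each function call gets its own local dictionary, which is why x and y never show up in the module namespace.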

Types of namespaces

There are 4 types of namespaces:

  1. Built-In
  2. Global
  3. Enclosing
  4. Local
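As a quick reminder, the four levels can be sketched in one snippet. The names label, outer, inner and read_global are placeholders chosen for this example; Python searches Local, then Enclosing, then Global, then Built-in.

```python
import builtins

label = "global"                 # global namespace (module level)


def outer():
    label = "enclosing"          # enclosing namespace (as seen from inner)

    def inner():
        label = "local"          # local namespace of inner()
        return label

    def inner_without_local():
        return label             # not local, so the enclosing value is found

    return inner(), inner_without_local()


def read_global():
    return label                 # not local or enclosing, so global is used


# len is defined in none of the above, so it comes from the built-in namespace.
results = (*outer(), read_global(), len is builtins.len)
print(results)  # ('local', 'enclosing', 'global', True)
```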

No need to go into details here, these are basic coding concepts.


We will just write down some best practices.

Best practices

global keyword

Do not use the {python} global keyword. It lets a function (local namespace) rebind a variable that lives on a different namespace level (global), which makes it hard to see where a value changes.

Global variable lookups are also slower than local ones (at least in CPython).
If you need to pass variables around, encapsulate them in a class or config object.
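A small sketch contrasting the two approaches, assuming module-level execution. RunState and both function names are hypothetical; the point is only that the second version makes the shared state visible in the function signature.

```python
from dataclasses import dataclass

counter = 0


def increment_with_global():
    global counter       # discouraged: silently rebinds a module-level name
    counter += 1


@dataclass
class RunState:
    """Hypothetical container for state that several functions share."""
    counter: int = 0


def increment(state: RunState) -> None:
    state.counter += 1   # the dependency is explicit in the signature


increment_with_global()  # works, but nothing at the call site shows what changed

state = RunState()
increment(state)
increment(state)
print(counter, state.counter)  # 1 2
```

With the state object, a test can create a fresh RunState per case instead of resetting a module-level variable.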

Mutables vs Immutable objects

Don't worry too much about performance: in most cases the difference between mutable and immutable types is negligible (choosing the right type is not!).

How to modify global types inside of local namespaces: Mutable types

If you wish to modify an object inside a function and make those changes last, most of the time there is no need to return that object. Just use mutable types instead of immutable types.

# assumed definition: the note only uses DummyClass, it never defines it
class DummyClass:
    def __init__(self):
        self.counter = 0

# option 1 is better: mutate in place, return nothing
def modify_mutable_option1(dummyClass):
    dummyClass.counter += 1

dummyClass1 = DummyClass()
modify_mutable_option1(dummyClass1)

# option 2: mutate in place and also return the same object
def modify_mutable_option2(dummyClass):
    dummyClass.counter += 1
    return dummyClass

dummyClass2 = DummyClass()
dummyClass2 = modify_mutable_option2(dummyClass2)

Just use option 1: it's shorter and skips the redundant return and reassignment.
It should be expected that mutable objects can be modified inside functions; this should, however, still be documented!
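For contrast, a short sketch of why this only works with mutable types. The function names are invented for the example; with an immutable int, the change stays inside the function unless the result is returned and reassigned.

```python
def bump_immutable(counter: int) -> int:
    counter += 1       # rebinds the local name only; the caller's int is untouched
    return counter


def bump_mutable(counters: list) -> None:
    """Increment the first element in place (documented side effect)."""
    counters[0] += 1   # mutates the list object the caller also holds


n = 0
bump_immutable(n)               # result discarded, so n is still 0
n_returned = bump_immutable(n)  # the new value only exists if it is returned

values = [0]
bump_mutable(values)            # the caller's list now holds 1
print(n, n_returned, values)    # 0 1 [1]
```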

Use immutable objects for fixed data

It prevents bugs and makes code safer.

Explore frozensets for an immutable, read-only version of sets.
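A brief sketch of what a frozenset gives you. ALLOWED_FORMATS and its contents are made-up example data, not something from a real project.

```python
ALLOWED_FORMATS = frozenset({"csv", "parquet"})  # hypothetical fixed data

try:
    ALLOWED_FORMATS.add("xlsx")
    mutated = True
except AttributeError:
    mutated = False   # frozenset has no add(), so accidental edits fail loudly

# Set operations still work; they return a new frozenset instead of mutating.
extended = ALLOWED_FORMATS | {"json"}

# Being hashable, a frozenset can also serve as a dict key.
loaders = {ALLOWED_FORMATS: "tabular loader"}

print(mutated, sorted(extended), len(ALLOWED_FORMATS))
```

The same idea applies to tuples as the fixed counterpart of lists.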

Multithreading

If you're using multithreading, immutable objects have the advantage of being thread-safe.

Example:

Not thread safe (any thread can change the dict while others read it):
{python}config = {"setting1": True, "setting2": 10} # mutable object

Thread safe:
{python} config = (True, 10) # immutable object

Threads can read the tuple safely without worrying about one thread changing the object while another is reading it.
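A minimal sketch of the idea with real threads. Config is a hypothetical NamedTuple whose field names mirror the dict keys above; the worker function and the thread count are arbitrary choices for the example.

```python
import threading
from typing import NamedTuple


class Config(NamedTuple):
    """Hypothetical immutable config with named fields."""
    setting1: bool
    setting2: int


config = Config(setting1=True, setting2=10)
results = []
results_lock = threading.Lock()   # the results list is mutable, so guard it


def worker():
    value = (config.setting1, config.setting2)  # reading needs no lock
    with results_lock:
        results.append(value)


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

try:
    config.setting2 = 99
    reassigned = True
except AttributeError:
    reassigned = False            # NamedTuple fields cannot be reassigned

print(results, reassigned)
```

Note that the lock protects the shared results list, not the config: only the mutable object needs coordination.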

Issues arising in data science:

In Data science

Python interactive windows are messy, so using sys.path.append is okay as long as it is not production code.

In data science, during the exploratory phase, we often have the following folder structure:

project/
├── current_approach.py
├── utils.py
├── approachTypeA/
│   ├── A1.py
│   └── A2.py
└── approachTypeB/
    ├── B1.py
    └── B2.py

Previous approaches are encapsulated in their own files, but they use functions from utils.py. They cannot simply import utils.py, because its folder is not on the Python path. Because this is not production code, and because the Python interactive window accentuates the issue, it is acceptable to append to the Python path manually.

Hacky Solution: append to the system path

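import sys
from pathlib import Path

# Get the project root: the directory two levels above the current file
# (e.g. project/ when this runs from project/approachTypeA/A1.py)
project_root = Path(__file__).resolve().parent.parent

# Append the project root to the system path so that `import utils` works
sys.path.append(str(project_root))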

Better Solution: Create a Python package and install it in editable mode:

Python packaging

Relationship with packages

Namespaces are one of the reasons we use packages. The above issue (being unable to import from the parent folder) can also arise if you have multiple entry points to an app. This can happen if you use a testing framework, or a web server with multiple entry points.

The solution for this is an "editable install": {shell}python -m pip install -e path/to/SomeProject

  1. Create a pip package from your project
  2. Install it in editable mode
  3. Import from anywhere inside of the project as you please

See Python packaging