hageldave/NPYScatter

NPYScatter

Desktop application for viewing .npy (numpy array) files in a scatter plot.

# basic usage
npyscatter example_data/iris_data.npy

# select dimensions, colorize
npyscatter example_data/iris_data.npy -x 0 -y 2 --color-values example_data/iris_labels.npy --cmap SET2

Usage

npyscatter <point_coordinates.npy> [options] or
npyscatter [options] <point_coordinates.npy> or
npyscatter [options] <point_coordinates.npy> [options]

Interactive Controls

The view can be panned and zoomed, and points can be selected with a rectangle, which highlights them.

| Action | Key |
| --- | --- |
| Panning | CTRL + LMB |
| Zooming | ALT + SCROLL |
| Selecting | SHIFT + LMB |

CLI Options

| Option | Description |
| --- | --- |
| `-h, --help` | Print help message and exit. |
| `-x, --x-idx <N>` | Column index for X axis (default: 0). |
| `-y, --y-idx <N>` | Column index for Y axis (default: 1). |
| `-p, --point-size <N>` | Point glyph scaling factor. |
| `-s, --size <N,N>` | Size of the canvas `<Width,Height>`. |
| `-v, --view <N,N,N,N>` | Coordinate view limits (view port) `<MinX,MaxX,MinY,MaxY>`. Make sure the argument is properly escaped so that negative values are not recognized as options (e.g. `'"-1,1,-1,1"'`). Defaults to bounding box of data if not provided. |
| `-o, --output <path>` | Path to output file (*.png, *.svg, *.pdf). Non-interactive, export then exit. |
| `-i, --ipc-file <path>` | Path to IPC file for selection exchange. |
| `--color-values <path>` | Path to .npy file with values to be mapped to color. |
| `--color-value-idx <N>` | Column index in color-values array (default: 0). |
| `--cmap <name>` | Color map name (default: S_TURBO). Color map names are prefixed with S, D, Q indicating their type sequential, diverging, qualitative (discrete). Based on the type and color-values array, different mapping strategies are applied. |
| `--cmap-list` | List available color maps and exit. |
| `--cmap-show` | Shows available color maps in a GUI. |
| `--x-label <name>` | Label for X axis. Default is 'Dim N' where N is the x-idx. |
| `--y-label <name>` | Label for Y axis. Default is 'Dim N' where N is the y-idx. |
| `--jitter <N>` | Add jitter to scatter points. Value in pixels. |
| `--draw-order <path/seed>` | Path to .npy file with point index ordering OR random seed to generate a permutation (long, '0x' prefix for hex otherwise decimal). |
| `--no-axes` | Hide coordinate system. View stretches over whole canvas. |
| `--fallback` | Use JPlotter fallback canvas. Use when OpenGL is not supported (e.g. MacOS). |
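
The options can also be assembled from a script. The following is a minimal sketch, assuming the input is a 2D array with one row per point and one column per dimension (as the column-index options suggest). The file names, the 5% padding, and the helper `build_command` are hypothetical, and the dtypes NPYScatter accepts are not stated here, so treat the ones used below as assumptions. The script writes coordinates and labels with NumPy, derives a padded view from the bounding box, and prints the command without running it.

```python
import os
import shlex
import tempfile

import numpy as np


def build_command(coords_path, labels_path, coords, x_idx=0, y_idx=1, pad=0.05):
    """Assemble an npyscatter argument list (hypothetical helper, not part of NPYScatter)."""
    x, y = coords[:, x_idx], coords[:, y_idx]
    # Pad the bounding box a little so that points do not sit on the border.
    dx, dy = (x.max() - x.min()) * pad, (y.max() - y.min()) * pad
    limits = (x.min() - dx, x.max() + dx, y.min() - dy, y.max() + dy)
    # Literal double quotes keep negative numbers from being parsed as options,
    # mirroring the '"-1,1,-1,1"' example in the --view description.
    view = '"' + ",".join(f"{v:g}" for v in limits) + '"'
    return [
        "npyscatter", coords_path,
        "-x", str(x_idx), "-y", str(y_idx),
        "--view", view,
        "--color-values", labels_path,
    ]


rng = np.random.default_rng(42)
coords = rng.normal(size=(150, 4))      # 150 points, 4 dimensions
labels = rng.integers(0, 3, size=150)   # one integer label per point (dtype is an assumption)

out_dir = tempfile.mkdtemp()
coords_path = os.path.join(out_dir, "points.npy")
labels_path = os.path.join(out_dir, "labels.npy")
np.save(coords_path, coords)
np.save(labels_path, labels)

command = build_command(coords_path, labels_path, coords, x_idx=0, y_idx=2)
# shlex.join adds the outer single quotes needed to paste the line into a shell.
print(shlex.join(command))
```

Passing `command` as a list to `subprocess.run` would hand the quoted view string to the application unchanged, which sidesteps shell escaping altogether.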

Color Mapping

The available color maps are those shipped with the JPlotter library.

| Prefix/Type | Array characteristics | Strategy |
| --- | --- | --- |
| S sequential | * | [min,max] range of values is mapped with color interpolation |
| D diverging | values >= 0 | same behavior as S |
| D diverging | 0 >= values | same behavior as S |
| D diverging | * | value 0 is used as diverging point, [-abs(values), +abs(values)] range is mapped with color interpolation |
| Q qualitative | integers | every value is mapped to a color of the map, no interpolation, colors are repeated when there are more distinct values than colors |
| Q qualitative | integers, min == -1 | same as general integer case, but -1 is mapped to a transparent magenta color indicating invalid/noise cluster |
| Q qualitative | * | every distinct value is mapped to a color, no interpolation, colors are repeated when there are more distinct values than colors |
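
The case distinction in the table can be sketched in a few lines of Python. This is one reading of the table and not NPYScatter's actual code: the function names and returned labels are hypothetical, "integers" is interpreted as "all values are whole numbers", and the mixed-sign diverging range is assumed to be symmetric around 0 with the largest absolute value as its limit.

```python
def mapping_strategy(cmap_type, values):
    """Return a label for the strategy the table assigns to (cmap_type, values).

    cmap_type is the prefix letter "S", "D" or "Q"; the labels are
    hypothetical names chosen for this sketch.
    """
    values = list(values)
    if cmap_type == "S":
        return "interpolate over [min, max]"
    if cmap_type == "D":
        if all(v >= 0 for v in values) or all(v <= 0 for v in values):
            return "interpolate over [min, max]"  # same behavior as S
        return "interpolate symmetrically around 0"
    if cmap_type == "Q":
        if all(float(v).is_integer() for v in values):
            if values and min(values) == -1:
                return "one color per integer, -1 transparent magenta"
            return "one color per integer"
        return "one color per distinct value"
    raise ValueError(f"unknown color map type: {cmap_type!r}")


def diverging_position(value, values):
    """Map value to [0, 1] so that 0 lands on the middle of the color map.

    Assumption: the range is [-limit, +limit] with limit = max(|values|).
    Only meaningful for the mixed-sign case, where limit is never 0.
    """
    limit = max(abs(v) for v in values)
    return 0.5 + 0.5 * value / limit


print(mapping_strategy("D", [-2.0, 0.5, 4.0]))
print(mapping_strategy("Q", [-1, 0, 1, 2]))
print(diverging_position(-2.0, [-2.0, 0.5, 4.0]))
```

The order of the checks matters: the single-sign diverging rows and the `min == -1` row are special cases, so they are tested before the general row of their type.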

Installation

NPYScatter is a Java application which is built with Maven. To build it, use

mvn clean install

which will compile the code and assemble a runnable .jar file in a newly created subdirectory target/. The application can then be run via java -jar target/npyscatter-0.0.1-SNAPSHOT-jar-with-dependencies.jar.

Note: It is recommended to create a script (bash/powershell, depending on OS) that contains this command and make it available globally. This way you can use a concise abbreviation like npyscatter as used in this readme. When updating your build, nothing needs to be moved or replaced.

Ubuntu Script Example

Create a file npyscatter, containing

#!/usr/bin/env bash
java -jar ~/git/NPYScatter/target/npyscatter-0.0.1-SNAPSHOT-jar-with-dependencies.jar "$@"

Make it executable, then move it to ~/.local/bin/, which should be on your PATH by default:

chmod a+x npyscatter
mv npyscatter ~/.local/bin/ # or ~/bin/ or another directory included on $PATH that you can write to

Brushing & Linking

NPYScatter implements file-based inter-process communication (IPC) for brushing and linking. It can write the indices of the currently selected points to a .npy or text file whenever the selection changes. Other applications can watch this file and react accordingly (e.g. for brushing & linking across views). In fact, NPYScatter itself watches this file and updates its highlighted points whenever the file changes. This allows multiple instances of NPYScatter to be linked simply by sharing the same selection file.

Usage

Pass the file as the -i or --ipc-file option's argument when launching:

npyscatter data.npy --ipc-file /tmp/selection.npy

The file contains a 1D int32 NumPy array of the selected point indices. An empty selection writes an array of shape (0,).
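
Because NPYScatter also watches the file, another program can drive the highlighting by writing a selection itself. Below is a minimal sketch of such a producer, assuming the 1D int32 layout described above is also what NPYScatter expects to read. The helper `write_selection` and the temporary path are hypothetical. The array is written to a temp file first and then renamed, so a watcher never sees a half-written file.

```python
import os
import tempfile

import numpy as np


def write_selection(ipc_path, indices):
    """Atomically replace ipc_path with a 1D int32 array of selected indices."""
    selection = np.asarray(indices, dtype=np.int32).reshape(-1)
    directory = os.path.dirname(os.path.abspath(ipc_path))
    # Temp file in the same directory, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as tmp:
            np.save(tmp, selection)  # given a file object, NumPy does not append ".npy"
        os.replace(tmp_path, ipc_path)  # readers see the old or the new file, never a partial one
    except BaseException:
        os.unlink(tmp_path)
        raise


ipc_file = os.path.join(tempfile.mkdtemp(), "selection.npy")
write_selection(ipc_file, [3, 17, 42])   # select three points
print(np.load(ipc_file))
write_selection(ipc_file, [])            # clear the selection, shape (0,)
print(np.load(ipc_file).shape)
```

The temp file shows up as an extra entry in the watched directory for a moment; consumers that filter events by file name are unaffected.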

A plain text file can also be used.

npyscatter data.npy -i selection.txt

Example Snippets

NPYScatter already watches the selection file and reacts to changes. The following snippets show how your own code can do the same.

Java consumer

import org.jetbrains.bio.npy.NpyFile;
import java.nio.file.*;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

Path ipcFile = Path.of("/tmp/selection.npy");
WatchService watcher = FileSystems.getDefault().newWatchService();
ipcFile.getParent().register(watcher, StandardWatchEventKinds.ENTRY_MODIFY, StandardWatchEventKinds.ENTRY_CREATE);

AtomicBoolean stopRequested = new AtomicBoolean(false);

System.out.println("Watching for selection changes...");
while (!stopRequested.get()) {
    WatchKey key = watcher.poll(100, TimeUnit.MILLISECONDS); // returns null on timeout
    if (key == null) {
        Thread.yield();
        continue; // re-check stopRequested
    }
    for (WatchEvent<?> event : key.pollEvents()) {
        Path changed = (Path) event.context();
        if (changed.equals(ipcFile.getFileName())) {
            int[] indices = NpyFile.read(ipcFile).asIntArray();
            System.out.println("Selection updated: " + indices.length + " points");
            // TODO: react to the new selection
        }
    }
    key.reset();
    Thread.yield();
}
watcher.close();

Note: WatchService watches the parent directory for changes, then filters by filename. Both ENTRY_CREATE and ENTRY_MODIFY must be registered because NPYScatter writes selections via an atomic rename (temp file → target): this triggers an ENTRY_CREATE event on the target file, not ENTRY_MODIFY.

watcher.poll(timeout, unit) is used instead of watcher.take() so that the stopRequested condition is checked regularly even when no events arrive, rather than blocking indefinitely.

Python consumer

import numpy as np
import time, os

ipc_file = "/tmp/selection.npy"
last_mtime = None

while True:
    try:
        mtime = os.path.getmtime(ipc_file)
        if mtime != last_mtime:
            last_mtime = mtime
            indices = np.load(ipc_file)
            print(f"Selection updated: {len(indices)} points → {indices}")
    except FileNotFoundError:
        pass  # file not written yet
    time.sleep(0.1)
