GitHub - Tharos-ux/bird_id: Small ML/DL project to identify bird by sound

# Birdcall identification

## Overview

This project applies deep learning to birdcall classification, using spectrograms that encode .mp3 files. The software provides a full pipeline, from data preprocessing to the interpretation of results, with one minimal example to showcase its usage. A short overview of the project's key functions is available in the Jupyter Notebook overview_projet.ipynb.

## Usage

You may use the main.py file to interact with the software, and the constants.py file for tweaking parameters.

usage: main.py [-h] [-c] [-s] [-p] [-m MODEL] [-o] [-f] [-r]

options:
  -h, --help            show this help message and exit
  -c, --clean           Erase all data inside training folder
  -s, --spectrograms    Builds the spectrograms from the current data folder
  -p, --predict         Predicts unknown spectrograms from a train folder   
  -m MODEL, --model MODEL
                        Path to a saved model
  -o, --output          Model will be saved on disk for later use
  -f, --filter          Specifies spectro filtering
  -r, --resnet          Uses ResNet architecture for model
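
The help text above is consistent with a parser built on Python's standard `argparse` module. The sketch below is reconstructed from that help text only and is not taken from the project's `main.py`; the function name `build_parser` is hypothetical, and every flag except `--model` is assumed to be a boolean switch.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the documented flags (assumed, not copied from main.py)."""
    parser = argparse.ArgumentParser(prog="main.py")
    parser.add_argument("-c", "--clean", action="store_true",
                        help="Erase all data inside training folder")
    parser.add_argument("-s", "--spectrograms", action="store_true",
                        help="Builds the spectrograms from the current data folder")
    parser.add_argument("-p", "--predict", action="store_true",
                        help="Predicts unknown spectrograms from a train folder")
    parser.add_argument("-m", "--model", type=str, default=None,
                        help="Path to a saved model")
    parser.add_argument("-o", "--output", action="store_true",
                        help="Model will be saved on disk for later use")
    parser.add_argument("-f", "--filter", action="store_true",
                        help="Specifies spectro filtering")
    parser.add_argument("-r", "--resnet", action="store_true",
                        help="Uses ResNet architecture for model")
    return parser


# Example: build filtered spectrograms, train a ResNet model and save it.
args = build_parser().parse_args(["-s", "-f", "-r", "-o"])
print(args.spectrograms, args.resnet, args.model)  # prints: True True None
```

Switches that are not passed default to `False`, and `--model` stays `None` unless a path is given, which is one plausible way for the program to decide between training a new model and loading a saved one.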

## Implemented functions

NAME
    tensorflow_model

FUNCTIONS
    load_model(model_path: str) -> tuple
        Loads a model and a list of classes from a previously saved model

        Args:
            model_path (str): path to folder containing the saved files

        Returns:
            tuple: (model,list of classes)

    modeling(data_directory: str, img_height: int, img_width: int, params: dict, save_status: bool, resnet: bool, save_path='models') -> tuple
        Calls for model creation and fitting, and then evaluates metrics for this model with a test set.

        Args:
            data_directory (str): path to the training data folder
            img_height (int): height of images
            img_width (int): width of images
            params (dict): dict of params, as defined in 'constants.py'
            save_status (bool): tells if the model should be saved to disk when computation ends
            resnet (bool): tells if the model should use the ResNet architecture
            save_path (str, optional): output path for saving model. Defaults to "models".

        Returns:
            tuple: (model, classes names)


    naive_model(img_height: int, img_width: int, params: dict, class_names: list)
        Inits a CNN-style model (keras-sequential) from given parameters

        Args:
            img_height (int): height for images
            img_width (int): width for images
            params (dict): dict of params, as defined in 'constants.py'
            class_names (list): all classes used for train

        Returns:
            Sequential: descriptions of layers as an object

    plot_metrics(metrics, classes_names: list, predictions: list, labels: list, path_to_save: str = None)
        Plots out metrics and test set results

        Args:
            metrics (history): model fitting information
            classes_names (list): all classes used for train
            predictions (list): predicted results for test set instances
            labels (list): true results for test set instances
            path_to_save (str, optional): path where model will be stored. Defaults to None.

    prediction(entry_path: str, trained_model: keras.engine.sequential.Sequential, img_height, img_width, class_names) -> str
        Does a prediction from an image file

        Args:
            entry_path (str): path to file to test
        Args:
            trained_model (tf.model): trained model
            classes (list): list of classes used for train
            model_training_informations (history): history object containing per-iteration information
            predictions (list): list of predictions for test set
            labels (list): true labels for test set
            save_status (bool): if should save to drive
            params (dict): model dict parameters (constants.py format)
            cpu_exec_time (int): processor time of fitting the model
            exec_time (int): real time for model fitting
            save_path (str): path where model will be stored

NAME
    metadata_extract

FUNCTIONS
    extract_name(input: str, output: str = 'spec_name.json')
        Extracts species name information from a metadata .csv file into a .json file
        Args:
            input (str): metadata file
            output (str, optional): file path to store name information. Defaults to "spec_name.json".

    extract_rating(input: str, output: str = 'rating.json')
        Extracts rating information from a metadata .csv file into a .json file
        Args:
            input (str): metadata file
            output (str, optional): file path to store rating information. Defaults to "rating.json".
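
As an illustration of what such an extraction might look like, here is a minimal sketch using only the standard library. The helper name `extract_column` and the column names `filename` and `rating` are assumptions made for this example; the project's actual metadata layout may differ.

```python
import csv
import json


def extract_column(input_path: str, output_path: str,
                   key_col: str = "filename", value_col: str = "rating") -> dict:
    """Map key_col to value_col for every row of a CSV file and dump the result as JSON."""
    with open(input_path, newline="", encoding="utf-8") as handle:
        # DictReader uses the first row as column names; values are kept as strings.
        mapping = {row[key_col]: row[value_col] for row in csv.DictReader(handle)}
    with open(output_path, "w", encoding="utf-8") as handle:
        json.dump(mapping, handle, indent=2)
    return mapping
```

Passing a different `value_col` would let the same helper cover both the rating and the species name cases.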

NAME
    audio_sampling

FUNCTIONS
    audio_processing(data_path: str, output_path: str, specie: str, max_spectro: int = 700, rating_max: float = 4, filter: bool = False) -> None
        Exports raw audio files as pre-processed spectrograms

        Args:
            data_path (str): directory containing species folders
            output_path (str): output master directory
            specie (str): name of subfolder
            max_spectro (int, optional): limits the number of spectrograms to plot per species. Defaults to 700.
            rating_max (float, optional): defines a target score level for audio, referring to a 'rating.json' file. Defaults to 4.
            filter (bool, optional): tells if a restrictive filter should be applied to chunks. Defaults to False.

    export_spectro(l_chunks: list, specie_name: str, filename: str, output_path: str)
        Converts audio into spectrograms and exports them
            /!\ Spectrograms are fixed at 500x400px for consistency: without fixed params, some came out at a different size

        Args:
            l_chunks (list): list of all audio chunks to plot
                                                                                                                                                                                                                                    
    load_in_blocks(audio_path: str, frame_size: int = 3, limit_chunks: int = 100, filter: bool = False, overlap: float = 0.5)
        Chunks audio into parts of 'frame_size' seconds

        Args:
            audio_path (str): path to audio
            frame_size (int, optional): chunk size in seconds. Defaults to 3.
            overlap (float, optional): overlap ratio between chunks. Defaults to 0.5.
        Returns:
            list of chunks
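
The chunking step can be illustrated on a plain list of samples. This is a sketch of fixed-size windows with fractional overlap, not the project's implementation: the helper name `split_with_overlap` is hypothetical, sizes are expressed in samples rather than seconds, and dropping a trailing partial window is an assumption.

```python
def split_with_overlap(samples: list, frame_len: int, overlap: float = 0.5,
                       limit_chunks: int = 100) -> list:
    """Cut `samples` into windows of `frame_len` items that share `overlap` of their length."""
    if frame_len <= 0 or not 0 <= overlap < 1:
        raise ValueError("frame_len must be positive and overlap must be in [0, 1)")
    # Distance between window starts; at least 1 so the loop always advances.
    step = max(1, int(frame_len * (1 - overlap)))
    chunks = []
    for start in range(0, len(samples) - frame_len + 1, step):
        if len(chunks) >= limit_chunks:
            break
        chunks.append(samples[start:start + frame_len])
    return chunks


# Ten samples, windows of 4 with 50% overlap: windows start at 0, 2, 4 and 6.
print(split_with_overlap(list(range(10)), frame_len=4))
```

For audio sampled at `sr` samples per second, a 3-second frame would correspond to `frame_len = 3 * sr` under these assumptions.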

NAME
    main

FUNCTIONS
    futures_collector(func: Callable, argslist: list, num_processes: int) -> list
        Spawns len(argslist) instances of func and executes them with at most num_processes instances running at a time.
        
        * func : a function
        * argslist (list): a list of tuples, arguments of each func
        * num_processes (int) : max number of concurrent instances
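
A bounded pool of this kind can be sketched with the standard `concurrent.futures` module. The helper below is hypothetical (`collect_results` is not the project's function), and it deliberately substitutes a thread pool for processes to keep the example self-contained; `ProcessPoolExecutor` exposes the same `submit` interface if real processes are wanted.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable


def collect_results(func: Callable, argslist: list, max_workers: int) -> list:
    """Run func(*args) for each tuple in argslist, with at most max_workers running at once."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(func, *args) for args in argslist]
        # as_completed yields futures as they finish, so result order is not guaranteed.
        return [future.result() for future in as_completed(futures)]


# Three calls to pow, two at a time; sorted because completion order may vary.
print(sorted(collect_results(pow, [(2, 3), (3, 2), (10, 2)], max_workers=2)))  # prints: [8, 9, 100]
```

Collecting with `as_completed` returns results as soon as they are ready; iterating over `futures` directly would instead preserve the order of `argslist`.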

