\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
\*\*\*\*\*\*\*\*\*\*\*\*\*\* English Below \*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*\*
This folder contains a dataset of 10 European bird species, carefully labeled and mixed with other sound samples at a controlled signal-to-noise ratio (SNR).
----------------------------------------
Directory structure
The project is organized into four main directories:
- File_origin
- File_cleaned_normalised
- File_mixed
- Programs_used
----------------------------------------
📁 File_origin
├── BirdOnly
This folder contains raw samples from the Xeno-Canto database (only recordings rated *A* or *B*).
Each subfolder corresponds to a labeled species class.
- Total number of classes: 10 (European species only).
- Each `.mp3` file has a corresponding annotation file with the same name and the `_Annotation.xml` extension.
Annotation format:
- start time of the vocalization (in seconds)
- end time of the vocalization (in seconds)
- minimum frequency of the vocalization
- maximum frequency of the vocalization
- minimum amplitude (between 0 and -0.5)
- maximum amplitude (between 0 and 0.5)
- class identifier
The mapping between identifiers and species names is defined in the file: labelMap.xml
> 🔸 Note:
> The `_More_Classes` folder contains 10 additional classes, but their annotations are automatically generated and may be inaccurate (not human-validated).
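
The naming convention above can be used to pair each recording with its annotation file. The following is a minimal sketch, not one of the scripts shipped in Programs_used. It assumes that "same name" means the file stem followed by `_Annotation.xml` (for example `XC1.mp3` and `XC1_Annotation.xml`); the helper names `annotation_path` and `pair_recordings` are hypothetical.

```python
from pathlib import Path

# Suffix stated in this README; the stem-based pairing is an assumption.
ANNOTATION_SUFFIX = "_Annotation.xml"


def annotation_path(audio_path):
    """Return the annotation path expected next to an .mp3 recording."""
    audio_path = Path(audio_path)
    return audio_path.with_name(audio_path.stem + ANNOTATION_SUFFIX)


def pair_recordings(species_dir):
    """Yield (mp3, annotation) pairs for one species subfolder.

    Recordings whose annotation file is missing are skipped.
    """
    for mp3 in sorted(Path(species_dir).glob("*.mp3")):
        xml = annotation_path(mp3)
        if xml.exists():
            yield mp3, xml
```

Calling `pair_recordings` on one species subfolder of BirdOnly yields only the recordings that have an annotation file, which makes incomplete downloads easy to spot.
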
----------------------------------------
├── Noise-ESC-50
Sound samples (5 seconds each) from the ESC-50 dataset, containing non-bird sounds.
├── Noise-Pixabay
Sound samples of rain and wind from the Pixabay platform.
├── Noise-Quebec
Sound samples of forest noise recorded in Quebec.
----------------------------------------
📁 File_cleaned_normalised
├── Bird
Contains the bird samples from *File_origin* with their environmental background removed.
(*Annotation files remain unchanged.*)
├── Noise
Contains noise samples from *File_origin*, cleaned and normalized.
----------------------------------------
📁 File_mixed 🔹 (Probably the folder you are looking for) 🔹
This folder contains all bird samples mixed with environmental and non-bird sounds,
in a biologically consistent manner and with controlled SNR.
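
For readers unfamiliar with SNR-controlled mixing, the sketch below shows the standard way to scale a noise clip so that the mixture reaches a target SNR in decibels. It is an illustration only and is not taken from the scripts in Programs_used: the function name `mix_at_snr`, the use of NumPy, and the choice to loop a short noise clip are all assumptions.

```python
import numpy as np


def mix_at_snr(bird, noise, snr_db):
    """Mix `bird` with `noise` so the result has the requested SNR in dB."""
    bird = np.asarray(bird, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)
    if len(bird) == 0 or len(noise) == 0:
        raise ValueError("both signals must be non-empty")

    # Assumption: loop or trim the noise so it covers the whole bird clip.
    reps = int(np.ceil(len(bird) / len(noise)))
    noise = np.tile(noise, reps)[: len(bird)]

    p_bird = np.mean(bird ** 2)
    p_noise = np.mean(noise ** 2)
    if p_bird == 0 or p_noise == 0:
        raise ValueError("both signals must have non-zero power")

    # SNR_dB = 10 * log10(p_bird / (gain**2 * p_noise)), solved for gain.
    gain = np.sqrt(p_bird / (p_noise * 10 ** (snr_db / 10)))
    return bird + gain * noise
```

Here the bird signal is left untouched and only the noise is rescaled, so the annotation amplitudes stay meaningful. A real pipeline would probably also guard against clipping after the sum.
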
----------------------------------------
📁 Programs_used
List and description of scripts used to:
- extract files via the Xeno-Canto API,
- clean the recordings according to their annotations,
- and mix the bird sounds with the selected noise samples.
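
As an illustration of the cleaning step, the sketch below keeps only the annotated time intervals of a recording. This is one plausible reading of "clean according to annotations" and is not the actual script: the real cleaning may also use the annotated frequency and amplitude bounds. The function name `extract_segments` and its arguments are hypothetical, and the audio is assumed to be already decoded into a NumPy array.

```python
import numpy as np


def extract_segments(samples, sample_rate, intervals):
    """Concatenate the annotated (start_s, end_s) intervals of a recording."""
    samples = np.asarray(samples)
    pieces = []
    for start_s, end_s in intervals:
        # Convert seconds to sample indices and clamp to the recording.
        start = max(0, int(round(start_s * sample_rate)))
        end = min(len(samples), int(round(end_s * sample_rate)))
        if end > start:
            pieces.append(samples[start:end])
    if not pieces:
        return samples[:0]  # empty array with the same dtype
    return np.concatenate(pieces)
```

The `intervals` argument corresponds to the start and end times listed in each `_Annotation.xml` file; reading them from the XML is left out because the exact schema is not described here.
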