## Ocean Barcode Atlas

Source code for:  
- Public website  
https://oba.mio.osupytheas.fr/ocean-atlas/  
(in @tara-oceans://tara-data : wimg_laravel_oba)  
- Development website  
https://oba.mio.osupytheas.fr/ocean-atlas_dev/  
(in @tara-oceans://tara-data : wimg_laravel_oba_dev)  
- Private website (password and login not mentioned for security reasons)  
https://oba.mio.osupytheas.fr/ocean-atlas-private/  
(in @tara-oceans://tara-data : wimg_laravel_oba_private)  
- [User guide](https://oba.mio.osupytheas.fr/ocean-atlas/build/pdf/Ocean-Barcode-Atlas_User_Manual.pdf)

### Table of Contents
- [Git](#git)
  * [Push modify the code](#push-modify-the-code)
  * [Pull update code](#pull-update-code)
  * [Merge validate your changes](#merge-validate-your-changes)
  * [Clone to server](#clone-to-server)
  * [Git error](#git-error)
  * [Update some files](#update-some-files)
- [Laravel](#laravel)
  * [Laravel framework](#laravel-framework)
  * [Laravel update](#laravel-update)
  * [Learning Laravel](#learning-laravel)
- [Debugging](#debugging)
- [OBA insert new dataset](#oba-insert-new-dataset)
  * [Stations samples and contextual data](#stations-samples-and-contextual-data)
  * [Barcodes](#barcodes)
    + [Dataset barcodes](#dataset-barcodes)
    + [Reference barcodes](#reference-barcodes)
    + [BLAST and VSearch alignment](#blast-and-vsearch-alignment)
    + [Json files](#json-files)
  * [DATABASE](#database)
  * [Integration in laravel](#integration-in-laravel)
    + [wimg.php](#wimgphp)
    + [database.php](#databasephp)
- [Barcode catalogs](#barcode-catalogs)
  
# Git
All the code is in the Git repository, except for the files listed in `.gitignore`.  
To modify the code, run on the local machine: `git clone http://gitlab.osupytheas.fr/ocean_atlas/oba.git`  
This creates an `oba` folder.  
```
cd oba
git branch
* master
  oba_dev
```
The master branch corresponds to the public site and to the private site, the oba_dev branch to the development site.   
To change branch: `git checkout oba_dev` 

## Push modify the code 
Now you can modify the code. To stage the changes, commit them and push them to the remote repository (GitLab) ([tutorial](https://openclassrooms.com/fr/courses/7162856-gerez-du-code-avec-git-et-github/7165726-travaillez-depuis-votre-depot-local-git)):
```
git add . 
git commit -m "comment" 
git push 
```

## Pull update code
To update the code on the server, log in to the server and run:  
``` 
cd /tara-data/wimg_laravel_oba_dev 
sudo git pull 
```
⚠️ Watch out for permission problems (root vs. www-data:www-data, see these [command lines](https://gitlab.osupytheas.fr/ocean_atlas/oba/-/blob/master/doc/permission_problems.sh)).  
Check on http://oba.mio.osupytheas.fr/ocean-atlas_dev/ that everything works.  
<p align="center"><img src="https://gitlab.osupytheas.fr/ocean_atlas/oba/raw/master/doc/git50.svg"></p>

## Merge validate your changes
When the changes on the oba_dev branch are finished and have been tested, merge the oba_dev branch into the master branch.  
On the local machine:
```
git checkout master
git merge oba_dev
git push
```
⚠️ If the merges are not done regularly, this leads to conflicts. If that is the case:
```
git mergetool
#case by case resolution
git status
git commit
git push
```
 
## Clone to server
If something goes wrong on the server (during an update, for example) and you need a fresh clone, first identify the branch: oba_dev for the dev version, master for the public version. The following commands apply to the dev service:
```
sudo git clone -b oba_dev http://gitlab.osupytheas.fr/ocean_atlas/oba.git
sudo mv oba wimg_laravel_oba_git
sudo cp wimg_laravel_oba_dev/.env wimg_laravel_oba_git/
sudo cp wimg_laravel_oba_dev/public/.htaccess wimg_laravel_oba_git/public/
sudo cp -r wimg_laravel_oba_dev/storage wimg_laravel_oba_git
sudo cp -r wimg_laravel_oba_dev/public/tmp wimg_laravel_oba_git/public/
sudo cp -r wimg_laravel_oba_dev/public/build/pdf wimg_laravel_oba_git/public/build/
sudo cp -r wimg_laravel_oba_dev/vendor wimg_laravel_oba_git/
cd wimg_laravel_oba_git
sudo chown -R www-data:www-data storage/logs/
sudo chown -R www-data:www-data storage/framework/
sudo chown -R www-data:www-data storage/tmp/
sudo chown -R www-data:www-data bootstrap/cache/
sudo chown -R www-data:www-data public/tmp/
sudo chown -R www-data:www-data public/build/pdf/
sudo chown -R www-data:www-data storage/compute_cache/
cd ..
sudo mv wimg_laravel_oba_dev wimg_laravel_oba_dev_prec
sudo mv wimg_laravel_oba_git wimg_laravel_oba_dev
cd wimg_laravel_oba_dev
sudo php artisan key:generate
sudo php artisan config:cache
sudo php artisan view:clear
sudo php artisan cache:clear
# Test the web service. If there is a problem, restore the previous version:
cd ..
sudo mv wimg_laravel_oba_dev wimg_laravel_oba_dev_git
sudo mv wimg_laravel_oba_dev_prec wimg_laravel_oba_dev
```
For the public version, the following commands are also needed:
```
sed -i 's\<b><font color=red size=3em>Confidential version\<!--<b><font color=red size=3em>Confidential version\' resources/views/OBA/oba.blade.php
sed -i 's\internal use</font></b><br>\internal use</font></b><br>--!>\' resources/views/OBA/oba.blade.php
sed -i 's\#private\/*#private\' config/wimg.php
sed -i 's\#end private\*/#end private\' config/wimg.php
sudo php artisan config:cache
sudo php artisan view:clear
sudo php artisan cache:clear
```
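
The repeated `chown` calls in the clone procedure can be folded into a loop. The following is only a sketch: the function name `fix_oba_permissions` and the `OBA_DRY_RUN` variable are hypothetical (they are not part of the repository), and the directory list is simply copied from the commands above. The repository's own `doc/permission_problems.sh` remains the reference.

```shell
# Sketch: give the web server user ownership of the writable Laravel directories.
# $1: application directory, $2: owner (default www-data:www-data).
# OBA_DRY_RUN is a hypothetical variable introduced for this example:
# when it is set to 1, the commands are printed instead of being run.
fix_oba_permissions() {
    app_dir=$1
    owner=${2:-www-data:www-data}
    for sub in storage/logs storage/framework storage/tmp bootstrap/cache \
               public/tmp public/build/pdf storage/compute_cache; do
        if [ "${OBA_DRY_RUN:-0}" = "1" ]; then
            echo "sudo chown -R $owner $app_dir/$sub"
        else
            sudo chown -R "$owner" "$app_dir/$sub" || return 1
        fi
    done
}
```

Setting `OBA_DRY_RUN=1` before calling `fix_oba_permissions /tara-data/wimg_laravel_oba_dev` only prints the seven commands, which makes it possible to review them before running anything with `sudo`.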

## Git error 
To list the commits in reverse chronological order:   
`git log`  
This displays a SHA-1 identifier for each commit.   
To return to a given commit, take the first 8 characters of its SHA-1 and run:  
`git checkout the_first_8_characters`  

## Update some files
Update the README and the doc folder on the oba_dev branch (on the local machine):
```
cd oba
git checkout oba_dev
git checkout --patch master README.md
git checkout --patch master doc
git add .
git commit -m "add doc and update readme in oba_dev branch"
git push
```

# Laravel
Laravel is a web application framework with expressive, elegant syntax. We believe development must be an enjoyable, creative experience to be truly fulfilling. Laravel attempts to take the pain out of development by easing common tasks used in the majority of web projects, such as:

- [Simple, fast routing engine](https://laravel.com/docs/routing).
- [Powerful dependency injection container](https://laravel.com/docs/container).
- Multiple back-ends for [session](https://laravel.com/docs/session) and [cache](https://laravel.com/docs/cache) storage.
- Expressive, intuitive [database ORM](https://laravel.com/docs/eloquent).
- Database agnostic [schema migrations](https://laravel.com/docs/migrations).
- [Robust background job processing](https://laravel.com/docs/queues).
- [Real-time event broadcasting](https://laravel.com/docs/broadcasting).

Laravel is accessible, yet powerful, providing tools needed for large, robust applications. A superb combination of simplicity, elegance, and innovation give you tools you need to build any application with which you are tasked.

<p align="center"><img src="https://gitlab.osupytheas.fr/ocean_atlas/oba/raw/master/doc/Model_View_Controller50.svg"></p> 

## Laravel framework
- app/ -> working code  
 app/Console/Commands: all the console commands  
 app/Http/Controllers: controllers, which contain the logic to process client requests    
 app/Http/Controllers/API: logic to process API requests   
 app/Http/Middleware: in AdminAccess.php, add your IP to access https://oba.mio.osupytheas.fr/ocean-atlas/manageOgaJobs?jobnb=50
- config/ -> framework configuration  
 database.php & wimg.php: to modify when adding a dataset (see [Integration in laravel](#integration-in-laravel))  
 After modifying these files, run:   
 ```
 sudo php artisan config:cache
 sudo php artisan view:clear
 sudo php artisan cache:clear
 ```
- doc/ -> documentation (not part of the base Laravel structure)
- public/ -> accessible from the front end  
 public/build/: contains images, CSS code, JavaScript and JSON files (used here for taxonomy auto-completion)   
 public/sum_station_abundance: contains, for each dataset, the sum of the abundances in each station   
 public/vendors: contains JavaScript libraries   
- resources/views -> visible part of the graphical interface; Laravel uses the "Blade" template engine
- routes/ -> management of the application entry URLs  
 api.php for the API and web.php for the controllers
- storage/ -> contains temporary application data: compiled views, caches, session keys, etc.
- .env -> The .env file contains the passwords of your services as well as all the sensitive data of your application (database password, address of the database…). This file should never be shared.

## Laravel update
To be done approximately every 6 months.   
Follow the available documentation: https://laravel.com/docs/    
Often it is necessary to modify the file composer.json, then on the tara server in /tara-data :
```
sudo chown -R user_name:user_name wimg_laravel_to_complete
cd wimg_laravel_to_complete
composer update
composer dumpautoload
composer update

php artisan route:cache
php artisan route:clear

sudo chown -R www-data:www-data storage/logs/
sudo chown -R www-data:www-data storage/framework/
sudo chown -R www-data:www-data storage/tmp/
sudo chown -R www-data:www-data bootstrap/cache/
sudo chown -R www-data:www-data public/tmp/
sudo chown -R www-data:www-data storage/compute_cache/
sudo chown -R www-data:www-data public/build/pdf

sudo php artisan config:cache
sudo php artisan config:clear
sudo php artisan view:clear
sudo php artisan cache:clear
sudo php artisan optimize:clear
```

## Learning Laravel
Laravel has the most extensive and thorough documentation and video tutorial library of any modern web application framework. The [Laravel documentation](https://laravel.com/docs) is thorough, complete, and makes it a breeze to get started learning the framework.

If you're not in the mood to read, [Laracasts](https://laracasts.com) contains over 900 video tutorials on a range of topics including Laravel, modern PHP, unit testing, JavaScript, and more. Boost the skill level of yourself and your entire team by digging into our comprehensive video library.

## Contributing
Thank you for considering contributing to the Laravel framework! The contribution guide can be found in the [Laravel documentation](http://laravel.com/docs/contributions).

## Security Vulnerabilities
If you discover a security vulnerability within Laravel, please send an e-mail to Taylor Otwell at taylor@laravel.com. All security vulnerabilities will be promptly addressed.

## License
The Laravel framework is open-sourced software licensed under the [MIT license](http://opensource.org/licenses/MIT).

# Debugging
Server architecture:
<p align="center"><img src="https://gitlab.osupytheas.fr/ocean_atlas/oba/raw/master/doc/Archi_OBA50.svg"></p> 

- The data and the database are backed up monthly on a remote server. If the tara server (mioprox8) goes down, the SIP can restart the service.   
- To debug, go to the wimg_laravel_oba_dev folder and run `tail -n 150 storage/logs/laravel.log`  
- To restart Apache: `sudo service apache2 restart`  
- To restart MariaDB: `sudo systemctl restart mysql`  
- If you need to work on Laravel and stop the service:  
```
sudo php artisan down --message="Sorry the OBA service is down, we're working on it..."  # stop the service and display a message
sudo php artisan up  # restart the service
```
- To add debug output from the code:  
 `Log::debug()` in PHP (written to laravel.log)  
 `console.log()` in JavaScript (written to the browser console)  
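
When `laravel.log` is long, filtering on the log level can be quicker than reading the last 150 lines. The sketch below assumes Laravel's default log line format (`[date] environment.LEVEL: message`); the function name `last_laravel_errors` is hypothetical.

```shell
# Sketch: print the most recent ERROR entries of a Laravel log file.
# $1: log file (default storage/logs/laravel.log), $2: number of entries (default 10).
# Only the first line of each entry is shown, not the stack trace that follows it.
last_laravel_errors() {
    log_file=${1:-storage/logs/laravel.log}
    count=${2:-10}
    grep -E '^\[[^]]+\] [A-Za-z0-9_]+\.ERROR:' "$log_file" | tail -n "$count"
}
```

From the wimg_laravel_oba_dev folder, `last_laravel_errors storage/logs/laravel.log 5` would list the five most recent error headers.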

# OBA insert new dataset  
Define the name of the dataset (referred to as XXX below).  

Ideally, the input file should be in the form:
|ASV_id|ASV_seq|taxonomy|reference_id|% id|nb reads/sample1|nb reads/sample2|...|
|------|-------|--------|------------|----|----------------|----------------|---|

Identify the reference database and its version; it is often SILVA for prokaryotes and PR2 for eukaryotes.  
Find out which primers were used to obtain the barcodes (forward and reverse primers).  

On the Tara server (/tara-data), create a db_XXX directory and put the input file there.  
Copy the scripts from the folder of another dataset (db_YYY).
 
⚠️ These scripts must be adapted (in particular the names of the input and output files, but not only those).

## Stations samples and contextual data 
⚠️⚠️⚠️ The order of the samples must match the header of the abundance matrix.
- If the dataset comes from the Tara Oceans campaign  
The scripts and a workflow are available in the [barcode_dada2 pipeline](https://gitlab.osupytheas.fr/ocean_atlas/barcode_dada2/-/tree/main/3_metadata_TARA).   
The workflow takes as input sample ids of the form ‘TARA_N000001608’, one id per line.  
Recover these identifiers from the header of the input file. If the ids are not of the form TARA_NXXXX, correspondence files exist. Always check that all the samples are present and that none is duplicated.  
The output files are: XXX_sample_metadata_definition, XXX_sample_metadata.tsv, XXX_sample.tsv.  
These files allow the sample data to be loaded into the MySQL database.  
- If the dataset comes from the MOOSE-GE campaign  
The scripts and a workflow are available in the [barcode_dada2 pipeline](https://gitlab.osupytheas.fr/ocean_atlas/barcode_dada2/-/tree/main/3_metadata_MOOSE).   
You need two files:

  - sample_xlsx: file filled in during the MOOSE campaign
  - sample_ifremer: http://donnees-campagnes.flotteoceanographique.fr/  
  enter campaign name : MOOSE-GE*    
  in "instruments" (lower left facet) : check Bouteilles  
  
Command line: `sbatch metadata_MOOSE.sh mail project_name sample_xlsx sample_ifremer`

- Otherwise  
You have to build these three files yourself.  

sample_metadata.tsv
|sample_id|metadata_id|numeric_value|categoric_value|
|---------|-----------|-------------|---------------|
|1|	1	|\N	|MGE18_01_5_02_DNA|

sample_metadata_definition.tsv
|metadata_id|name|description|unity|category/numeric|
|-----------|----|-----------|-----|----------------|
|1|OBA_ID|Sample material| |category|

sample.tsv
|sample_id|sample_name1|sample_name2|station|depth|size_min|size_max|latitude|longitude|
|---------|------------|------------|-------|-----|--------|--------|--------|---------|
|1|MGE18-01-5-02-DNA|MGE18-01-5-02-DNA|1|SRF|2.2|3|43.3801|7.2296|
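
Two of the checks mentioned in this section (no duplicated sample, same sample order as the abundance matrix header) can be scripted. This is a minimal sketch, assuming a tab-separated abundance matrix and a plain text file with one sample name per line; the function names `duplicated_sample_ids` and `check_sample_order` are hypothetical.

```shell
# Print the sample identifiers that occur more than once
# ($1: file with one identifier per line).
duplicated_sample_ids() {
    sort "$1" | uniq -d
}

# Compare the sample order of a list with the header of the abundance matrix.
# $1: abundance matrix (tab-separated, first line is the header)
# $2: number of leading columns that are not samples
# $3: file with one sample name per line, in the order used for sample.tsv
# Prints the differences, if any; returns 0 only when both orders are identical.
check_sample_order() {
    matrix=$1
    skip=$2
    samples=$3
    head -n 1 "$matrix" | tr '\t' '\n' | tail -n +"$((skip + 1))" | diff - "$samples"
}
```

With the ideal input layout described earlier (ASV_id, ASV_seq, taxonomy, reference_id, % id, then one column per sample), the second argument would be 5. An empty output from both functions means that no problem was found.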

## Barcodes
### Dataset barcodes
- The abundances must be rarefied with R: every sample is randomly subsampled down to the smallest total read count observed among the samples. These rarefied values are needed for beta diversity.  

```r
library(data.table)
library(magrittr)
library(vegan)
library(ggplot2)
library(stringr)

setwd("../../projects/mio_oga")
getwd()

##### OTU 18SV9 v3 ####################
data<-fread("TARA-Oceans_18S-V9_Swarm_table.tsv")
otu_names<-data[,1]
rownames(data)<-data$V1
data<-data[,- c(1:6)]  # keep only the columns containing reads
data<-t(data)          # samples as rows, as expected by rrarefy()
read_number <- apply(data,1,sum)   # total number of reads per sample
min_read_number <- min(read_number)
min_read_number
# 624514
data_rarefied <- rrarefy(data,min_read_number)
data_rarefied_t=t(data_rarefied)
data_rarefied_t=cbind(otu_names,data_rarefied_t)
write.table(data_rarefied_t,"18SV9v3_data_rarefied",quote=FALSE,sep="\t",row.names = FALSE)
```

- Transfer the data_rar.csv file to the directory created on the tara server.
Rename the script [XXX_gene_abun_table.pl](https://gitlab.osupytheas.fr/ocean_atlas/oba/-/blob/master/doc/XXX_gene_abun_table.pl) and adapt it to the input file.
The script is commented. Command line:
```console
./XXX_gene_abun_table.pl input_file
```
This script generates 5 output files:  
1. XXX.fna → FASTA file with id/taxon and sequence, for BLAST and VSearch alignment  
2. XXX_gene.tsv → to fill the gene table  
3. XXX_gene_abundance.tsv → to fill the gene_abundance table (number of reads and number of rarefied reads)  
4. taxon_dif → to create the JSON file used for taxonomy auto-completion  
5. XXX_sum_abundance.json → to normalize the abundances  
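
Before the rarefied table is passed to the script, it can be worth confirming from the shell that every sample really has the same total read count. The sketch below assumes a tab-separated table with a header line and the OTU/ASV identifier in the first column, which is the layout written by the R code above; the function names `sample_read_totals` and `is_evenly_rarefied` are hypothetical.

```shell
# Print "sample<TAB>total reads" for every sample column of an abundance table.
# $1: tab-separated table, header on the first line, identifiers in column 1.
sample_read_totals() {
    awk -F '\t' '
        NR == 1 { for (i = 2; i <= NF; i++) name[i] = $i; n = NF; next }
        { for (i = 2; i <= n; i++) total[i] += $i }
        END { for (i = 2; i <= n; i++) printf "%s\t%d\n", name[i], total[i] }
    ' "$1"
}

# Return 0 when all samples have the same total, which is what rarefaction
# to a common depth is expected to produce; return 1 otherwise.
is_evenly_rarefied() {
    [ "$(sample_read_totals "$1" | cut -f 2 | sort -u | wc -l | tr -d ' ')" = "1" ]
}
```

For the 18SV9v3 example, every total printed by `sample_read_totals 18SV9v3_data_rarefied` should equal the minimum read number reported by R.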

### Reference barcodes
Download the reference database in FASTA format (XXX_ref.fna), paying attention to the version used. It is often SILVA or PR2.  
Silva: https://www.arb-silva.de/download/archive/  
PR2: https://github.com/pr2database/pr2database/  
EukRibo: https://zenodo.org/record/6327891#.Y3Js_L6ZOXk  
Sometimes the reference must be requested from the people who generated the data, because they modified it.  
Run the [refDB.pl](https://gitlab.osupytheas.fr/ocean_atlas/oba/-/blob/master/doc/refDB.pl) script on the FASTA file (XXX_ref.fna). It creates two output files:  
- XXX_reftaxon.tsv → to populate the reftaxon table
- ref_dif → to create the JSON file used for taxonomy auto-completion

### BLAST and VSearch alignment
- BLAST  
Go to the tara-data/make_db_wimg/databases_aln/tara_databases/db_blast_nucl directory.  
Copy the FASTA files XXX.fna and XXX_ref.fna there.  
Create the BLAST databases:
```console
sudo formatdb -p F -i XXX.fna -o T -t XXX.fna -s -v 8000
sudo formatdb -p F -i XXX_ref.fna -o T -t XXX_ref.fna -s -v 8000
```
Then delete the FASTA files XXX.fna and XXX_ref.fna.
- VSearch  
Go to the tara-data/make_db_wimg/databases_aln/tara_databases/db_vsearch directory.  
Create symbolic links:
```console
sudo ln -s ../../../../db_XXX/XXX_ref.fna XXX_ref.fna
sudo ln -s ../../../../db_XXX/XXX.fna  XXX.fna
```
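
Before building the databases, a quick look at the FASTA files can catch obvious problems such as an unexpected number of sequences or repeated identifiers. This is only a sketch with hypothetical function names (`fasta_sequence_count`, `fasta_duplicated_ids`); it assumes that the identifier is the first word of each header line.

```shell
# Count the sequences of a FASTA file (one header line starting with ">" per sequence).
fasta_sequence_count() {
    grep -c '^>' "$1"
}

# Print the sequence identifiers (first word of the header) that appear more than once.
fasta_duplicated_ids() {
    grep '^>' "$1" | sed 's/^>//' | awk '{ print $1 }' | sort | uniq -d
}
```

For example, the value printed by `fasta_sequence_count XXX.fna` can be compared with the number of lines of XXX_gene.tsv, on the assumption that both files describe the same barcodes.
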
### Json files
The [taxonJson.pl](https://gitlab.osupytheas.fr/ocean_atlas/oba/-/blob/master/doc/taxonJson.pl) script creates them (change the name of the output file to XXX_taxotree.json in the first case and to XXX_taxotree_ref.json in the second).  
`./taxonJson.pl taxon_dif`  
`./taxonJson.pl ref_dif`  

## DATABASE
On the tara server go to the db_XXX folder which contains all the files and connect to the database:
```
mysql -u root -p --local-infile tobig
```
You will be prompted for a password (not given here for security reasons).
```
CREATE TABLE XXX_gene_abundance LIKE 18SV9v3_gene_abundance;
LOAD DATA LOCAL INFILE 'XXX_gene_abundance.tsv' INTO TABLE XXX_gene_abundance FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_gene LIKE 18SV9v3_gene;
LOAD DATA LOCAL INFILE 'XXX_gene.tsv' INTO TABLE XXX_gene FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_metadata_definitions LIKE 18SV9v3_metadata_definitions;
LOAD DATA LOCAL INFILE 'XXX_metadata_definitions.tsv' INTO TABLE XXX_metadata_definitions FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_sample_metadata LIKE 18SV9v3_sample_metadata;
LOAD DATA LOCAL INFILE 'XXX_sample_metadata.tsv' INTO TABLE XXX_sample_metadata FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_sample LIKE 18SV9v3_sample;
LOAD DATA LOCAL INFILE 'XXX_sample.tsv' INTO TABLE XXX_sample FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_taxonomy LIKE 18SV9v3_taxonomy;
LOAD DATA LOCAL INFILE 'XXX_taxon.tsv' INTO TABLE  XXX_taxonomy FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
CREATE TABLE XXX_reftaxon LIKE 18SV9v3_reftaxon;
LOAD DATA LOCAL INFILE 'XXX_reftaxon.tsv' INTO TABLE  XXX_reftaxon FIELDS TERMINATED BY '\t' LINES TERMINATED BY '\n';
```
For each table, the structure is copied (`CREATE TABLE ... LIKE`) and then filled (`LOAD DATA`).  
⚠️ After each `LOAD DATA` command, check that all the lines of the file were loaded, that `Deleted: 0` is reported, and whether there are warnings. If there are, `show warnings;` shows what went wrong.
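
Since the same pair of statements is repeated for every table, the statements can also be generated for a given dataset name. The sketch below is an illustration only: the function name `oba_load_sql` is hypothetical, the table and file names are the ones listed in the SQL block above, and 18SV9v3 is used as the default template dataset.

```shell
# Sketch: print the CREATE TABLE ... LIKE / LOAD DATA statements for a new dataset.
# $1: dataset name (XXX), $2: existing dataset used as template (default 18SV9v3).
# Each pair below is "table suffix:file suffix", as in the statements shown above.
oba_load_sql() {
    dataset=$1
    template=${2:-18SV9v3}
    # Literal \t and \n are kept as text so that MySQL interprets them.
    clause="FIELDS TERMINATED BY '\\t' LINES TERMINATED BY '\\n';"
    for pair in gene_abundance:gene_abundance gene:gene \
                metadata_definitions:metadata_definitions \
                sample_metadata:sample_metadata sample:sample \
                taxonomy:taxon reftaxon:reftaxon; do
        table=${pair%%:*}
        file=${pair##*:}
        printf "CREATE TABLE %s_%s LIKE %s_%s;\n" "$dataset" "$table" "$template" "$table"
        printf "LOAD DATA LOCAL INFILE '%s_%s.tsv' INTO TABLE %s_%s %s\n" \
            "$dataset" "$file" "$dataset" "$table" "$clause"
    done
}
```

The output of `oba_load_sql XXX` can be reviewed and then pasted into the mysql session. Running `wc -l XXX_*.tsv` beforehand gives the number of lines of each file, to be compared with the record count reported after each `LOAD DATA`.
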
ℹ️ A database schema is available [here](https://gitlab.osupytheas.fr/ocean_atlas/oba/-/blob/master/doc/schema_OBA_tobig.svg)

The following query lists the size fractions present in the dataset, so that colors can be defined for them in Laravel.
```
select *,count(*) from XXX_sample group by upper_fraction,lower_fraction;
```

## Integration in laravel
- Add the XXX_sum_abundance.json file to public/sum_station_abundance.
- Add the XXX_taxotree.json and XXX_taxotree_ref.json files to public/build/json.  

In wimg_laravel_oba_dev, edit the files config/wimg.php and config/database.php.  
### wimg.php
- In the barcode array (around line 167) add:
```
                'XXX'=>array(
                    'title' => 'blabla',
                    'id' => 'XXX_',
                    'color_SF' => array(
                        "[0.8-20µm]"=>"#e904ed",
                        "[5-20µm]"=> "#ffff19", 
                        "[20-180µm]"=> "#82020b", 
                        "[180-2000µm]"=> "#ff7200", 
                        "[>0.8µm]"=> "#07cdf9",
                        "[>3µm]" => "#07b52f",
                    ),
                    'primer-forward' => "add_primer-forward",
                    'primer-reverse' => "add_primer-reverse",
                ),
```
The size fractions correspond to those obtained previously with `select * from XXX_sample group by upper_fraction,lower_fraction;`               
- In the db_path array (around line 327) add:
```
		'XXX' => array(
			  'blast_nucl' => '../storage/sequence_databases/db_blast_nucl/XXX.fna',
			  'vsearch' => '../storage/sequence_databases/db_vsearch/XXX.fna',
			  'vsearch_ref' => '../storage/sequence_databases/db_vsearch/XXXref.fna',
			  'blast_ref' => '../storage/sequence_databases/db_blast_nucl/XXXref.fna',
			  'normalization_file'=>'sum_station_abundance/XXX_sum_abundance.json',
			  'protIs6FrameTrans' => 0, 
	         ),
```
### database.php
In the connections array (around line 34) add:
```
        'XXX' => [
            'driver' => 'mysql',
            'host' => env('DB_HOST', '127.0.0.1'),
            'port' => env('DB_PORT', '3306'),
            'database' => env('DB_DATA_NAME','laravel'),
            'username' => env('DB_DATA_USER','root'),
            'password' => env('DB_DATA_PWD','root'),
            'unix_socket' => env('DB_SOCKET', ''),
            'charset' => 'utf8mb4',
            'collation' => 'utf8mb4_unicode_ci',
            'prefix' => 'XXX_',
            'strict' => true,
            'engine' => null,
        ],
```        
In wimg_laravel_oba_dev do:
```
sudo php artisan config:cache
sudo php artisan view:clear
sudo php artisan cache:clear
```
This makes Laravel take the changes into account.
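
The same three artisan commands are used after every configuration change in this README, so they can be wrapped in a small helper. This is a sketch, not part of the repository: the function name `refresh_laravel_caches` and the `OBA_DRY_RUN` variable are hypothetical.

```shell
# Sketch: run the three artisan commands used after a configuration change.
# $1: Laravel directory (default: current directory).
# OBA_DRY_RUN is a hypothetical variable introduced for this example:
# when it is set to 1, the commands are printed instead of being run.
refresh_laravel_caches() {
    app_dir=${1:-.}
    for task in config:cache view:clear cache:clear; do
        if [ "${OBA_DRY_RUN:-0}" = "1" ]; then
            echo "(cd $app_dir && sudo php artisan $task)"
        else
            (cd "$app_dir" && sudo php artisan "$task") || return 1
        fi
    done
}
```

For the development site this would be called as `refresh_laravel_caches /tara-data/wimg_laravel_oba_dev`; the function stops at the first command that fails.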

# Barcode catalogs

The Tara Oceans 18S-V9 rRNA metabarcode data set consists of eight size-fractionated communities obtained from two depths in the photic zone (subsurface [SRF], deep-chlorophyll maximum [DCM]), one from the mesopelagic zone (MES) and one from the marine epipelagic mixed layer (MIX). Size fractionations corresponded to filter collected pico- and nanoplankton (0.8–5 μm), and plankton net tows for the nano-, micro-, and mesoplankton (5–20 μm, 20–180 μm and 180–2,000 μm, respectively) (http://taraoceans.sb-roscoff.fr/EukDiv/; de Vargas et al., 2015). This data set was built by sequencing plankton metabarcodes and assembling 1,685,214,722 raw reads, from 1,046 samples including Tara Oceans Polar Circle expedition (https://figshare.com/s/cfbf869ca84310fda6bb; Ibarbalz et al., 2019). Metabarcodes were clustered into biologically meaningful 474,303 OTUs using the “Swarm” approach (Mahé et al., 2014). For the taxonomic assignment of metabarcodes, the Protist Ribosomal Reference -PR2- database was used (Guillou et al., 2013).

18SV9v3:  
reference: EukRibo V1 https://doi.org/10.5281/zenodo.6327891  
18SV9v2 or v3 in OBA https://zenodo.org/record/3768510#.Yyg14dXP0UE  

18SV4:  
https://doi.org/10.5281/zenodo.7235995  
ref: pr2_version_4.11.1_UniEuk_V4_unique.tsv.gz  



The Tara Oceans 16S/18S rRNA miTags data set consists of two size-fractionated communities (0.22–1.6 μm and 0.22–3 μm) that were obtained from two depths in the photic zone (SRF and DCM), as well as one depth in the mesopelagic zone (MES) and one in the marine epipelagic mixed layer (MIX). The metagenomic reads corresponding to both size fractions (enriched in prokaryotes and giant viruses), described in Salazar et al. (2019), are available at https://www.ocean-microbiome.org and https://zenodo.org/record/3473199. For each prokaryote-enriched sample (N = 180), 19,037,038 merged raw Illumina reads (miTags) that contained signatures of the 16S/18S rRNA gene were extracted (Logares et al., 2014). These fragments were mapped to a set of 16S/18S reference sequences that were downloaded from the SILVA database (Release 128: SSU Ref NR 99; https://www.arb-silva.de/fileadmin/silva_databases/release_128/Exports/SILVA_128_SSURef_Nr99_tax_silva.fasta.gz). A total of 23,987 miTags sequences were annotated. Abundance tables were built by counting the number of miTags assigned to each taxon in each sample and the number of unassigned miTags (https://www.ebi.ac.uk/biostudies/files/S-BSST297/u/OM-RGC_v2_taxonomic_profiles.tar.gz).

The 16S-V4 V5 metabarcode data set from the Malaspina-2010 expedition was built from 60 samples of bathypelagic (BAT: 1,000–4,000 m) and abyssopelagic (ABY: 4,000–6,000 m) waters (Salazar et al., 2015) (https://github.com/GuillemSalazar/MolEcol_2015). This metabarcode data set based on 1,789,427 raw reads contained 3,902 OTU sequences for two plankton size fractions (0.2–0.8 μm and 0.8–20 μm). The taxonomic assignment was performed using the SILVA database (release 115; https://www.arb-silva.de/fileadmin/silva_databases/release_115/Exports/SSURef_NR99_115_tax_silva.fasta.tgz). Abundance tables contained the number of reads for the OTUs of particle-attached (PA) and free-living (FL) prokaryotes detected in 30 globally distributed sampling stations (https://github.com/GuillemSalazar/MolEcol_2015/blob/master/OTUtable_Salazar_etal_2015_Molecol_norarefac.txt).