Luc Camoin (Marseille / FR), Samuel Granjeaud (Marseille / FR), Marc Antoine Gerault (Marseille / FR)
Data-independent acquisition (DIA) is a recently developed strategy for global mass spectrometry (MS)-based proteomics. In DIA, all precursor ions falling within pre-defined isolation windows are co-isolated and fragmented, whereas in data-dependent acquisition (DDA) individual precursors are selected for fragmentation at a specific m/z. The fragment ions from each window are then analyzed by a high-resolution mass spectrometer. DIA is increasingly used in proteomics studies because it offers broad protein coverage, high reproducibility, and accuracy. The analysis of the resulting chimeric MS2 spectra is now facilitated by AI. Many software tools dedicated to DIA are available, including DIA-NN, which is increasingly popular in the proteomics community [1]. However, it is strongly advised to filter the DIA-NN output in R to obtain better quantification, which requires users to be comfortable with the R language. In addition, DIA-NN does not currently offer Top3 or iBAQ quantification. To overcome these problems, we present DIAgui, an R package based on Demichev's diann-rpackage that provides a user-friendly interface for processing the output of DIA-NN.
DIAgui contains two main functions: report_process, which processes the report file output by DIA-NN with a single R command, and runDIAgui, which launches the Shiny app to process the file interactively. After loading the report file, you can rename your fractions, which by default are named after the paths to the raw files used. You can then choose to extract the precursor, peptide, protein group or gene datasets from your report. You can filter on several q-values, keep only proteotypic peptides, and choose whether to discard modified peptides. For the protein group file, the MaxLFQ algorithm is used to quantify proteins, with two options: the implementation from diann-rpackage or the one from the iq package. The two methods are equivalent, but the iq implementation is much faster. You can also choose to compute Top3 and iBAQ quantification. For iBAQ, you can either load a FASTA file or use the seqinr package, which queries the SwissProt database. Using a FASTA file is much faster, since the Shiny app does not have to send a query for each protein. For the other datasets, the app uses the function diann_matrix, which is based on the function of the same name from diann-rpackage. We extended it with the features offered by the Shiny app: Top3 quantification, the number of peptides used for quantification (for the gene-centred dataset), and the option to take either the sum or the maximum of the intensities sharing the same ID. In the last tab of the app, you can visualize each of your datasets (either those obtained in the app or ones you uploaded) with an interactive heatmap, a density plot, an MDS plot, and other plots.
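To make the q-value filtering and Top3 steps concrete, here is a minimal, library-free Python sketch. It is not DIAgui's code (DIAgui itself is an R package): the row layout, the 0.01 cutoff, the function name top_n and the choice to average fewer than three peptides when fewer are available are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical, simplified report rows:
# (run, protein_group, peptide, intensity, q_value).
# A real DIA-NN report has many more columns; this layout is an assumption.
ROWS = [
    ("run1", "P1", "PEPA", 100.0, 0.001),
    ("run1", "P1", "PEPB", 300.0, 0.002),
    ("run1", "P1", "PEPC", 200.0, 0.005),
    ("run1", "P1", "PEPD", 50.0, 0.004),
    ("run1", "P1", "PEPE", 900.0, 0.20),  # fails the q-value cutoff
    ("run1", "P2", "PEPF", 80.0, 0.003),
    ("run1", "P2", "PEPG", 40.0, 0.001),
]


def top_n(rows, q_cutoff=0.01, n=3):
    """Mean of the n most intense peptides per (run, protein) after q-value filtering.

    Assumption: proteins with fewer than n peptides are averaged over the
    peptides that remain, rather than being discarded.
    """
    by_protein = defaultdict(list)
    for run, protein, _peptide, intensity, q_value in rows:
        if q_value <= q_cutoff:  # keep only confident identifications
            by_protein[(run, protein)].append(intensity)

    result = {}
    for key, intensities in by_protein.items():
        best = sorted(intensities, reverse=True)[:n]
        result[key] = sum(best) / len(best)
    return result


TOP3 = top_n(ROWS)
```

With these example rows, the peptide with q-value 0.20 is discarded before ranking, so P1 is quantified from its three most intense remaining peptides and P2 from its two available peptides.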