SARS-CoV-2 has spread rapidly around the world, with Brazil currently considered an epicenter of the pandemic. The Northern region has the second highest incidence coefficient, as well as the third highest mortality rate in the country. This study aimed to investigate the evolutionary history of epidemic spread and the genetic features of strains isolated in the Western Amazon, in the State of Rondônia, Brazil. A total of 22 mutations were detected. Some of these alterations may be related to effects on transmissibility, the fidelity of RNA replication and the ability of cancer patients to respond to infection; one further mutation emerged after the introduction of SARS-CoV-2 in Rondônia. At least two introduction events were detected, corresponding to the European lineages B.1 and B.1.1. One introduction possibly occurred through Argentina, the origin of strains that circulated in the Brazilian states of Minas Gerais and Ceará before reaching Rondônia (B.1); another occurred through Minas Gerais and the Federal District, which gave rise to strains that spread to Rondônia, from the capital to more rural parts of the state (B.1.1). The findings show the need to monitor the genetic epidemiology of COVID-19 in order to track the virus's evolution, dispersion and diversity.
These authors contributed equally: Luan Felipo Botelho-Souza, Felipe Souza Nogueira-Lima and Tárcio Peixoto Roca.
An outbreak of a serious respiratory disease of unknown etiology emerged in December 2019 in Wuhan, China. In early 2020, epidemiological and genetic analyses allowed for the identification of a new Coronavirus, later named Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) which causes a clinical presentation now defined as Coronavirus Disease of 2019/COVID-19[
SARS-CoV-2 has a large and complex positive-sense RNA genome of approximately 30,000 nucleotides that encodes known and hypothetical proteins[
Since the outbreak began, SARS-CoV-2 has spread rapidly around the world. The virus has a high potential for transmissibility, with a reproductive number (R0) that can vary between 2.0 and 4.0[
Due to this ease and high rate of transmissibility, as of January 7, 2021, COVID-19 had been responsible for more than 1,890,000 deaths worldwide, with Brazil accounting for about 10.5% of this total (
Extensive sequencing of the viral genome from different regions of the country provides insights into the prevalence of viral strains and into regional differences, which may lead to a better understanding of transmission patterns, support outbreak tracking and thereby facilitate the formulation of containment measures. This study presents genetic data on the first eight SARS-CoV-2 sequences isolated in the state of Rondônia. The information available on the main mutations detected was reviewed in order to provide current and important details for the development of vaccines, specific antivirals and effective diagnostic tests. Additionally, the phylodynamic relationships between these samples and sequences of isolates from different locations were studied to assess the epidemic and evolutionary history of the virus in this State.
The study was carried out by the Molecular Virology Laboratory of the Oswaldo Cruz Foundation of Rondônia (FIOCRUZ-RO) in collaboration with the Central Laboratory of Public Health of Rondônia (LACEN/RO) and the Leônidas and Maria Deane Institute (ILMD) of FIOCRUZ Amazonas. Ten combined swab samples collected from individuals residing in the State of Rondônia, Brazil, with clinical symptoms of COVID-19 had detectable viral RNA following the RT-qPCR protocol of the Centers for Disease Control and Prevention (CDC, Atlanta, USA). All samples were sequenced following the protocol described by Nascimento et al.[
Patients were informed in detail about the study and written consent was given by all participants. All clinical procedures and experiments were performed in accordance with international and national guidelines. The study was approved by the Research Ethics Committee of the Research Center for Tropical Medicine of Rondônia (CEP/CEPEM-RO), under opinion number 4000086.
Sequences of the viral isolates were aligned with the reference sequence for SARS-CoV-2 (NC_045512), a Wuhan isolate, in MEGA7 software (Molecular Evolutionary Genetics Analysis)[
Some recent studies have identified large groups of strains of SARS-CoV-2. Forster et al.[
The previous classification of SARS-CoV-2 strains isolated in the state of Rondônia/Brazil in SS groups had two main objectives: (
In this context, Pipes et al.[
Due to low homology at the extremities, the following analyses were performed on the region of the genome that extends from position 55 to 29,838 (positions determined based on the reference sequence NC_045512). The alignment was used as input for inferring a non-clock Maximum Likelihood tree using the software IQ-TREE v.1.6.12[
The molecular clock method was used for Bayesian inference of the phylogeny of evolutionary groups. The lognormal and exponential distributions of the uncorrelated relaxed clock were tested. The substitution model used was the second best according to the estimate previously described, TN93 + F + I (Tamura-Nei with empirical base frequencies and a proportion of invariant sites), since the TIM model is not among those supported in the BEAST v.1.10.4 package (Suchard et al., 2018). Phylogeny was calculated using the Coalescent: Exponential Growth model, through Markov chain Monte Carlo (MCMC), whose length was 1 × 10
Once the SS group of the strains was determined, the analyses continued to elucidate the detailed evolutionary history, through phylogenetic inference of the molecular clock with specific representatives of the group. Based on the results of the analysis of mutations and of the evolutionary group, it was observed that the study group would be SS4. Therefore, all SS4 representatives that were identified by Yang and collaborators (2020) were included in this second analysis.
Additionally, with exceptional help from the interactive panel of the GISAID platform, it was observed that the GISAID clades 20A, 20B and 20C comprise isolates that have the signature mutations of the SS4 group and, consequently, all sequences belonging to these clades were collected, using June 26th as the deadline. This collection was directed exclusively at strains isolated in South American countries, with Brazil being the only country whose sequences were collected at the State level. The use of this geographical filter at this stage of the study is justified due to the issue of urban mobility, a factor that may have been a facilitator and differential in the introduction of SARS-CoV-2 in the State of Rondônia.
Uncertainty and lack of resolution have been described regarding the phylogenies of SARS-CoV-2, due to the relatively small genetic diversity that has been accumulated during the short time of the outbreak[
This phylogenetic inference followed a methodology similar to that used to determine evolutionary groups. In summary, a non-clock Maximum Likelihood tree was built using IQ-TREE v.1.6.12 under the previously described parameters. This tree was then used to check for a temporal signal in the data set using the software TempEst v.1.5. Once a temporal signal was detected, the alignment was imported into the BEAST package for the molecular clock analysis. In this case, the lognormal and exponential distributions of the uncorrelated relaxed clock were also tested. The substitution model used was GTR + F + I. The adjusted length of the MCMC for convergence of the parameters was 3 × 10
Of the ten samples from patients with a confirmed diagnosis of COVID-19, eight were successfully sequenced by next-generation sequencing, generating complete genomic sequences with mean coverage > 99%. The analysis of mutations in the genomes of these strains showed a total of 22 alterations at different sites, one of which was found in a non-coding region. Among those found in coding regions, 12 are classified as non-synonymous mutations. They are found in 7 viral proteins: nsp1, nsp12, Spike, ORF3a, ORF6, ORF8 and Nucleoprotein. The complete list of mutations found and their percent frequency of occurrence is shown in Table 1. Percentage data on the frequency of mutations in the population were obtained from a comparison with isolates deposited in GISAID as of June 2, 2020.
Table 1 Mutations found in strains of SARS-CoV-2 isolated in Rondônia.
Mutation | Gene | Protein | Type of mutation | Amino acid change (position at the gene level) | Freq. among study isolates | Freq. in population (Brazil) | Freq. in population (Worldwide)
C241T | 5′ UTR | - | - | - | 100% | 100% | 88%
C364A | ORF1ab | nsp1 | Nonsynonymous | D33E | 25% | 0% | 0%
C3037T | ORF1ab | nsp3 | Synonymous | - | 100% | 100% | 88%
C11563T | ORF1ab | nsp7 | Synonymous | - | 12.50% | 0% | 0%
C14265T | ORF1ab | nsp12 RdRp | Synonymous | - | 12.50% | 0% | 0%
C14408T | ORF1ab | nsp12 RdRp | Nonsynonymous | P4715L | 100% | 100% | 88%
C15324T | ORF1ab | nsp12 RdRp | Synonymous | - | 25% | 0% | 5%
C16428T | ORF1ab | nsp13 Helicase | Synonymous | - | 25% | 0% | 0%
T22156C | S | Spike/surface | Synonymous | - | 12.50% | 0% | 0%
C23244A | S | Spike/surface | Nonsynonymous | P561H | 25% | 0% | 0%
A23403G | S | Spike/surface | Nonsynonymous | D614G | 100% | 100% | 87%
C23917T | S | Spike/surface | Synonymous | - | 12.50% | 0% | 0%
T25036C | S | Spike/surface | Synonymous | - | 12.50% | 0% | 0%
G25855T | ORF3a | ORF3a | Nonsynonymous | D155Y | 25% | 0% | 0%
A26045G | ORF3a | ORF3a | Nonsynonymous | Q218R | 25% | 0% | 0%
T27299C | ORF6 | ORF6 | Nonsynonymous | I33T | 75% | 2% | 4%
A28108C | ORF8 | ORF8 | Nonsynonymous | Q72P | 12.50% | 0% | 0%
G28881A | N | Nucleoprotein | Nonsynonymous | R203K | 75% | 100% | 43%
G28882A | N | Nucleoprotein | Synonymous | - | 75% | 100% | 43%
G28883C | N | Nucleoprotein | Nonsynonymous | G204R | 75% | 100% | 43%
T29148C | N | Nucleoprotein | Nonsynonymous | I292T | 75% | 2% | 4%
C29367T | N | Nucleoprotein | Nonsynonymous | P365L | 37.50% | 0% | 0%
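The frequencies among study isolates in Table 1 follow from counting, for each substitution, how many of the eight genomes carry it. The sketch below illustrates that bookkeeping on toy data; it is not the pipeline used in the study. It assumes sequences that are already aligned to the reference without gaps in the reference, and the function names and the ten-base toy sequences are hypothetical.

```python
from collections import Counter


def call_substitutions(reference, sample, offset=1):
    """List substitutions such as 'C241T' between two aligned sequences.

    Gaps ('-') and ambiguous bases (e.g. 'N') are skipped. `offset` is the
    genome coordinate of the first alignment column; this assumes the
    reference row of the alignment contains no gaps.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for i, (ref_base, alt_base) in enumerate(zip(reference.upper(), sample.upper())):
        if ref_base != alt_base and ref_base in "ACGT" and alt_base in "ACGT":
            calls.append(f"{ref_base}{i + offset}{alt_base}")
    return calls


def mutation_frequencies(reference, samples, offset=1):
    """Percentage of isolates carrying each substitution."""
    counts = Counter()
    for sequence in samples.values():
        counts.update(call_substitutions(reference, sequence, offset))
    return {mutation: 100.0 * n / len(samples) for mutation, n in counts.items()}


# Toy data (hypothetical): eight isolates against a ten-base reference.
reference = "ACGTACGTAC"
samples = {"iso1": "ATGTGCGTCC", "iso2": "ATGTGCGTAC", "iso3": "ATGTGCGTAC"}
samples.update({f"iso{i}": "ATGTACGTAC" for i in range(4, 9)})

for mutation, freq in sorted(mutation_frequencies(reference, samples).items()):
    print(mutation, f"{freq:.2f}%")
```

With eight isolates, one, two, three and six carriers correspond to 12.5%, 25%, 37.5% and 75%, which are the values that appear in the table.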
Four mutations were found in 100% of the isolates: C241T, C3037T, C14408T and A23403G. With the exception of C14408T, these were classified as signature mutations for super spreader group 4 (SS4) identified by Yang and collaborators (2020). Therefore, according to this information, all isolates from the present study belong to the SS4 group. Phylogenetic analysis to determine evolutionary groups (Fig. 2) confirms this classification.
The vast majority of the mutations found have no clinical or virological significance described in the literature, and some are considered unique. Among the known mutations in the ORF1ab gene, C14408T was found in 100% of the samples; it results in the replacement of a proline with a leucine at position 323 (P323L) of nsp12 RdRp (RNA-dependent RNA polymerase), which corresponds to P4715L in ORF1ab polyprotein numbering (Table 1). Alterations of this nature in viral enzymes raise concern, since they can cause resistance to drugs that target RdRp, as previously described for hepatitis C, influenza and also for a coronavirus infection in mice treated with remdesivir[
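The assignment by direct genomic observation can be sketched as a set-inclusion check against the signature mutations. The example below is only an illustration: it uses the signatures named in this text (SS4 here, SS2 and SS3 in the discussion of Fig. 2), it omits SS1 because its signatures are not listed in this section, and the function and dictionary names are hypothetical.

```python
# Signature mutations as named in this article; SS1 is omitted because its
# signatures are not listed in this section.
SIGNATURES = {
    "SS2": {"G26144T"},
    "SS3": {"G11083T"},
    "SS4": {"C241T", "C3037T", "A23403G"},
}


def classify_ss_group(mutations):
    """Return every SS group whose full signature set is present.

    A list is returned because a genome may match more than one group, and
    an empty list means that no listed signature set is complete.
    """
    found = set(mutations)
    return sorted(group for group, signature in SIGNATURES.items() if signature <= found)


# The four mutations shared by all eight Rondônia isolates.
print(classify_ss_group(["C241T", "C3037T", "C14408T", "A23403G"]))
```

Returning a list rather than a single label keeps ambiguous genomes visible, such as those that carry the signatures of two groups at once.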
However, the P323L alteration results in an amino acid with an isoelectric point similar to the wild type amino acid[
In addition, a recent preprint analyzing over 11,200 sequences proposed that this alteration may be associated with an increase in the rate of viral mutations[
Among the modifications found in the S gene, A23403G stands out. This non-synonymous mutation was found in 100% of the samples and results in the replacement of an aspartate amino acid with a glycine at position 614 (D614G) of the Spike protein. This protein, through its receptor binding domain (RBD), mediates the interaction of the virus with the host cell by binding to ACE-2, which consequently facilitates membrane fusion and viral penetration[
Other reasons have also been pointed out to justify this association: (
Considering that this mutation was first identified on January 28, 2020 (Germany/BavPat1/2020|EPI_ISL_406862), this increase in frequency was also found in a direct genomic analysis of all 1539 SARS-CoV-2 genomes deposited in the GISAID platform between February 29th and March 26th. There was a prevalence of 56% of isolates belonging to the SS4 group, which hosts the D614G mutation as a signature characteristic of the group, showing the rapid dissemination of this variant over time[
A recent in vitro study, currently available as a preprint, compared the functional properties of the D614 and G614 variants of the Spike protein and found that pseudotyped retroviruses carrying G614 infected ACE-2-expressing cells more efficiently. The improvement was associated with a possibly greater incorporation of Spike protein into the virion, which may therefore improve the transmission of SARS-CoV-2 between hosts[
A more current and comprehensive study brought new and relevant information to this discussion, by showing that the G614 variant has become dominant at the global level, in a way that suggests that such variant is undergoing positive selection. Additionally, it appears to be associated with a higher viral load in samples of the human upper respiratory tract and with greater infectivity in pseudotyping assays[
Clinically, D614G does not appear to be associated with the severity of the disease[
Two changes in different genes were found together in 75% of the samples: T27299C and T29148C. Both are classified as non-synonymous mutations that result in the substitution of an isoleucine amino acid with a threonine at positions 33 (I33T) and 292 (I292T) of the viral proteins ORF6 and N, respectively. According to the study by Candido et al.[
Other changes were also found in the N gene of the strains analyzed. Three consecutive nucleotide changes stand out: G28881A, G28882A and G28883C, which were found together in 75% of the samples. They result in the replacement of two amino acids of the viral Nucleoprotein, R203K and G204R (R, arginine; K, lysine; G, glycine). The potential effect of these mutations on viral and host processes has been investigated, and it has been observed that they result in considerable changes in the predicted binding of some miRNAs, which may influence the progress of the infection. Some of the miRNAs that bind to this mutated form of the nucleoprotein may be downregulated in several types of cancer. This raises the possibility that cancer patients may be highly susceptible to the mutated variant due to a reduced ability to contain the virus, compared to the wild-type infection[
Another alteration detected in the same gene is C29367T, found in three (37.5%) of the eight samples in the study. It is a nonsynonymous mutation that results in a P365L substitution. This mutation has not yet been described in the scientific literature. When looking for other sequences with this mutation in Dataset B, it was observed that none of the strains included show this change. This leads to the assumption that it has appeared more recently in the viral evolutionary history of SARS-CoV-2. Because it was detected only in some sequences in Rondônia, it may have appeared after the virus entered the state and can be used as a marker to study viral spread among different municipalities in the state.
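The amino acid positions quoted for these substitutions can be cross-checked from the nucleotide coordinates. The sketch below maps a genome position to a codon number and a position within the codon for genes that are translated in a single frame. The gene start coordinates are an assumption taken from the NC_045512 annotation, not from this article, and the function name is hypothetical. ORF1ab is left out because its polyprotein numbering and ribosomal frameshift need extra handling.

```python
# First nucleotide of each coding sequence, assumed from the NC_045512
# annotation (these coordinates are not given in this article).
GENE_STARTS = {"S": 21563, "ORF3a": 25393, "ORF6": 27202, "ORF8": 27894, "N": 28274}


def codon_location(gene, genome_pos):
    """Return (codon_number, position_in_codon), both 1-based."""
    offset = genome_pos - GENE_STARTS[gene]
    if offset < 0:
        raise ValueError(f"position {genome_pos} lies upstream of {gene}")
    return offset // 3 + 1, offset % 3 + 1


# C29367T in the N gene, reported above as P365L.
print(codon_location("N", 29367))
```

Under these coordinates, C29367T falls on the second base of codon 365 of N and A23403G on the second base of codon 614 of S, in agreement with P365L and D614G in Table 1.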
For both Dataset A and Dataset B, it was possible to observe a linear regression curve that shows a positive correlation between genetic diversity and sampling time, showing the existence of sufficient time signal in the data sets to justify a molecular clock approach (Fig. 1, A and B). Although the time signal level may be considered low for Dataset B, as evidenced by the R
Graph: Figure 1 Linear regression graphs of temporal signal detection. The graphs show the positive correlation between genetic diversity from root to tip (y-axis) and the sampling time of the included sequences (x-axis). This effect on the relationship of these variables shows the existence of a temporal signal in the analyzed data set, which makes it sufficient for molecular clock analysis. Graphs A and B refer to the analyses in datasets A and B, respectively. The value of R2 is shown in the upper left corner of the corresponding graph.
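The temporal signal check rests on an ordinary least-squares regression of root-to-tip genetic distance on sampling date. The sketch below shows that calculation in its simplest form; it is an illustration of the idea rather than the procedure of any particular program. It assumes decimal-year dates and distances in substitutions per site, and the function name and the example values are hypothetical.

```python
def root_to_tip_regression(dates, distances):
    """Least-squares fit of root-to-tip distance against sampling date.

    Returns (slope, intercept, r_squared). The slope approximates the
    evolutionary rate, and -intercept / slope approximates the root date.
    """
    n = len(dates)
    if n < 3 or n != len(distances):
        raise ValueError("need at least three paired observations")
    mean_x = sum(dates) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in dates)
    syy = sum((y - mean_y) ** 2 for y in distances)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dates, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical tips: decimal-year dates and distances in substitutions/site.
tip_dates = [2020.05, 2020.15, 2020.20, 2020.30, 2020.40]
tip_distances = [0.00010, 0.00021, 0.00020, 0.00033, 0.00038]
slope, intercept, r_squared = root_to_tip_regression(tip_dates, tip_distances)
print(f"rate={slope:.2e} subs/site/year, R2={r_squared:.2f}")
```

A positive slope indicates that later samples are on average more divergent from the root, and R2 summarizes how clock-like that accumulation is.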
Phylogenetic analysis to determine the evolutionary groups of the SARS-CoV-2 strains in the present study, using dataset A, confirmed one of the conclusions previously obtained from the study of mutations: all samples were identified as belonging to the SS4 group, with 100% posterior support for the clade (Fig. 2). The best distribution for the uncorrelated relaxed clock was "exponential", chosen through analysis of the convergence of the MCMC run parameters and of the tree topology.
Graph: Figure 2 Bayesian phylogenetic analysis to determine evolutionary group. In the generated MCC tree, the phylogenetic relationship was estimated from 49 SARS-CoV-2 sequences included in dataset A. The red taxa correspond to the SS1 group; groups SS2 and SS3 are in green, combined as previously proposed; and the SS4 group is in blue. The study samples are colored black, as is the sequence used for rooting the inferred tree. At each node, the posterior probability supporting the branch is shown in decimal form. The time to the most recent common ancestor (tMRCA) among all variations of SARS-CoV-2 was dated to October 23, 2019 (95% Highest Posterior Density, HPD: July 29 to December 17, 2019), similar to other studies[
According to the evolutionary history of the SS groups pointed out by Yang et al. (2020), strains descended from the original virus were transmitted to various locations in the world and were dominant for a period of time, during the early outbreak of COVID-19. However, with continuous transmission in different environments, the virus has evolved into four large super-spreading clusters, along with other variants derived directly from the original virus. SS group members became dominant, with different variants prevailing in different regions of the world, in mid-February and March.
The SS1 strains first emerged and were transmitted mainly in Asia, South Korea and the USA. They persisted in China during the post-initial outbreak phase, being less prevalent in other parts of the world. Groups SS2 and SS3 were transmitted mainly in mid-January and February, in Asian countries other than China, as well as Europe and Brazil, specifically in the State of São Paulo during the initial phase of the outbreak. Finally, group SS4 emerged in late January and was reported for the first time in Germany. It was primarily responsible for the outbreak of a pandemic on the European continent, replacing the previous dominance of strains SS2 and SS3 in the region. From this continent, this variant has spread to several other locations around the world, as already discussed in relation to the D614G substitution. It also arrived in South America where, in mid-March, it entered the State of Rondônia.
This analysis allowed us to observe that at least two different entry events occurred in the State, both of European descent. It also showed that the phylogenetic signal is insufficient to differentiate strains of groups SS2 and SS3. In fact, for identification through direct genomic observation, each group has only one signature mutation (G26144T for SS2 and G11083T for SS3), which may provide little phylogenetically useful difference for separating strains of these groups, considering the full size of the SARS-CoV-2 genome and its biological tendency toward conservation. In addition, we observed some Brazilian strains deposited in GenBank (MT126808.1 and MT350282.1) that have both of the aforementioned substitutions. Therefore, we suggest merging groups SS2 and SS3 in the classification of super spreaders. Fortunately, this question does not negatively influence the determination of the samples as descendants of the SS4 group.
Phylogenetic analysis to detail the evolutionary history of the SARS-CoV-2 strains from the present study was performed with the uncorrelated relaxed molecular clock model using dataset B. The "lognormal" distribution was chosen through analysis of the convergence of the MCMC run parameters and of the tree topology. The inferred tree showed that 75% of the strains isolated in the State of Rondônia belong to Pangolin lineage B.1.1, while the remaining 25% belong to lineage B.1 (Fig. 3A,B). This classification was supported by a posterior probability of 100% at some cladistic level of the tree and supports the previous conclusion that at least two SARS-CoV-2 entry events occurred in the State.
Graph: Figure 3 Bayesian phylogenetic tree for detailing the evolutionary path. In the generated MCC tree, the phylogenetic relationship was estimated based on 307 SARS-CoV-2 sequences included in dataset B. (a) The taxa and clades colored in blue correspond to strains belonging to variations of Pangolin lineage B.1.1, including B.1.1, B.1.1.1, B.1.1.10 and B.1.1.9. (b) The green taxa and clades correspond to strains belonging to other variations of Pangolin lineage B.1, including B.1, B.1.3, B.1.5, B.1.5.4, B.1.67 and B.1.8. The study samples are colored black, along with the sequence used for rooting the inferred tree. At each node, the posterior probability supporting the branch is shown in decimal form. The tMRCA among all variations of SARS-CoV-2 was dated to November 20, 2019 (95% HPD between October 17 and December 20, 2019), similar to other studies[
In order to avoid inaccurate conclusions regarding the introduction of SARS-CoV-2 in the State, details of the evolutionary path were obtained only from clades that included study samples and had posterior support greater than 85%. Under this criterion, it was not possible to fully detail the evolutionary history of the introduction of the B.1.1 lineage. In addition to being of European descent, B.1.1 strains from the state of Rondônia also descend from an ancestral strain that circulated in Argentina around the transition from February to March, with a differentiation date of February 25th (95% HPD between February 14th and 29th) (Fig. 3A). It was not possible to draw any further conclusions about the detailed path of transmission from Argentina to Rondônia, nor whether it occurred directly between these localities.
However, another interpretation is also possible. This group of sequences shares a common ancestor, descended from an older one (dated February 15th, with 95% HPD between January 28th and February 26th) that gave rise to strains isolated in mid-March in the state of Minas Gerais and the Federal District. Therefore, it is possible that strains circulating in these locations spread to Argentina and Rondônia. A previous study identified the transmission of B.1.1 strains to some South American countries, including Argentina[
The evolutionary path of the introduction of the B.1 lineage could be reconstructed in more detail. As with B.1.1, B.1 strains also share ancestry with a parental strain that circulated in Argentina, having differentiated from a common ancestor on February 29th (95% HPD between February 26th and March 15th). A more recent shared common ancestor, dated March 9th (95% HPD between March 8th and 21st), gave rise to strains that circulated in the Brazilian states of Minas Gerais and Ceará (Fig. 3B). This last date does not represent the exact time of arrival of this lineage in the State, but a period close to that event. It should be noted that the first confirmed case in the state of Rondônia occurred on March 20th.
Three pairs of samples cluster monophyletically in the analysis with 100% posterior support, showing a very high degree of similarity between them. This is the expected effect of sustained community transmission of the virus in the state. The monophyletic relationship of the B.1.1 strains in sample pairs 01–03 and 07–08 may provide relevant information about the viral dissemination profile in the State. With the exception of sample 07, these samples carry the aforementioned C29367T alteration, which presumably arose after the introduction of SARS-CoV-2 in the State and can be used as a source of information to study the form of dissemination. It is therefore presumed that this alteration arose during transmission, not necessarily direct, from 07 to 08 in the city of Porto Velho (place of residence of the respective carriers). Strains carrying this mutation then continued to be transmitted before reaching the sample pair 01–03. Since sample 03 was isolated from a patient residing in Porto Velho, and sample 01 from a patient residing in the municipality of Jaru (about 290 km from the capital), it is assumed that a strain was transmitted from Porto Velho to Jaru.
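The 95% HPD intervals quoted for these node dates are the narrowest intervals that contain 95% of the posterior samples. The sketch below shows one simple way to compute such an interval from a list of sampled values. It is an illustration only, not the routine used by the BEAST tools; it assumes a unimodal posterior and node ages expressed as numbers (for example decimal years), and the function name and toy values are hypothetical.

```python
import math


def hpd_interval(samples, mass=0.95):
    """Narrowest interval containing at least `mass` of the samples.

    Assumes a unimodal posterior; ties are resolved in favor of the
    leftmost window.
    """
    if not samples or not 0 < mass <= 1:
        raise ValueError("need samples and a mass in (0, 1]")
    ordered = sorted(samples)
    n = len(ordered)
    k = max(1, math.ceil(mass * n))  # number of samples inside the interval
    start = min(range(n - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[start], ordered[start + k - 1]


# Toy posterior sample with one extreme value in the upper tail.
print(hpd_interval([1, 2, 3, 4, 5, 6, 7, 8, 9, 100], mass=0.9))
```

Unlike an equal-tailed interval, the HPD interval is not symmetric around the median, which is why a bound can sit close to the point estimate, as with the March 9th node dated with a lower bound of March 8th.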
This study presented the genetic data of the first eight SARS-CoV-2 sequences isolated in the state of Rondônia, Brazil, located in the southern portion of the Western Amazon. At least two events of viral introduction into the state were identified, corresponding to lineages B.1 and B.1.1, around the transition from February to March 2020. In addition to both lineages being of European descent, possible introduction routes were observed through Argentina, passing through the Brazilian states of Minas Gerais and Ceará (B.1), as well as from Minas Gerais and the Federal District to Argentina and Rondônia (B.1.1).
Despite limitations resulting from the low number of samples analyzed in this study, genetic mapping allowed us to observe a total of 22 mutations. Some of these changes may be related to higher transmissibility (A23403G/D614G/Spike glycoprotein), to the fidelity of RNA replication (C14408T/P323L/nsp12 RdRp) and to the ability of cancer patients to respond to infection (G28881A, G28882A and G28883C/R203K and G204R/Nucleoprotein), in addition to a mutation (C29367T/P365L/Nucleoprotein) that emerged after the introduction of SARS-CoV-2 in the state of Rondônia, which may represent adaptation to environmental and human conditions. This information is important because it provides current and essential details for the development of vaccines, specific antivirals and effective diagnostic tests.
The findings highlight the importance of implementing a surveillance system for the genetic epidemiology of the virus in the State, which may permit the monitoring of viral evolution and dissemination in the capital and in other regions of the State through obtaining more genome sequences of the circulating strains. This can provide insights into the prevalence of viral strains and regional differences in patterns of transmission, epidemiological screening and formulation of containment measures.
This study was developed by a group of researchers from the Molecular Virology Laboratory of the Oswaldo Cruz Foundation of Rondônia, which, together with the Central Laboratory of the Health Secretariat of the Government of Rondônia (LACEN-SESAU/RO) and the Leônidas & Maria Deane Institute (ILMD/Fiocruz Amazônia), has contributed to the scientific development of the Amazon Region. Additionally, the authors would like to thank the GISAID platform, the laboratories and the respective authors that deposited SARS-CoV-2 sequences in this repository. Together, these contributions made studies like this one possible. A table acknowledging them is available in the Supplementary Material (table S1). FGN is funded by Fundação de Amparo à Pesquisa do Estado do Amazonas, FAPEAM (http://www.fapeam.am.gov.br, PCTI-EmergeSaúde/AM call No. 005/2020 and Rede Genômica de Vigilância em Saúde, REGESAM); Conselho Nacional de Desenvolvimento Científico e Tecnológico (http://www.cnpq.br, grants 440856/2016-7 and 403276/2020-9); Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (http://www.capes.gov.br, grants 88881.130825/2016-01 and 88887.130823/2016-00); Inova Fiocruz/Fundação Oswaldo Cruz (https://portal.fiocruz.br/programa-inova-fiocruz, Grant VPPCB-007-FIO-18-2-30, Geração de conhecimento).
Conceptualization and design of the study: L.F.B.-S., A.d.O.d.S., D.S.V.; Data curation: L.F.B.-S., F.S.N.-L., T.P.R.; Formal analysis: L.F.B.-S., F.S.N.-L., T.P.R.; Funding acquisition: R.d.C.P.R., C.H.N.S., A.P.D.S.G., F.F.d.M., F.R.M., J.M.V.S. and D.S.V.; Investigation: L.F.B.-S., F.S.N.-L., T.P.R.; Methodology: L.F.B.-S., F.S.N.-L., T.P.R., F.G.N., A.C.S.M., C.C.d.S., A.L.F.d.M.M., C.A.B.L., C.F.G.A., J.L.F.F., S.C.; Project administration: L.F.B.-S.; Supervision: J.M.V.C. and D.S.V.; Writing (original draft): L.F.B.-S., F.S.N.-L., T.P.R., A.d.O.d.S.; Writing (review & editing): D.B.P., J.M.V.C., D.S.V. and A.d.O.d.S.
This study was funded by the Oswaldo Cruz Foundation of Rondônia, Health Secretariat of the Rondonian State Government (LACEN-SESAU/RO) and the Leônidas & Maria Deane Institute (ILMD/Fiocruz Amazônia).
The datasets generated during the current study are available in the GISAID (Global Initiative on Sharing All Influenza Data) platform repository, under the accession numbers EPI_ISL_514131 to EPI_ISL_514138. Information about the collected sequences used in this study is available in the Supplementary Material (table S1). The other data generated during the development of the study are available together in a public repository (https://doi.org/10.17632/dnh8jpz6cn.1), containing the files needed for the analyses, specifically the alignments used, as well as the result files generated at each stage of the research.
The authors declare no competing interests.
Graph: Supplementary Legends.
Graph: Supplementary Table S1.
Graph: Supplementary Table S2.
The online version contains supplementary material available at https://doi.org/10.1038/s41598-021-83203-2.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
By Luan Felipo Botelho-Souza; Felipe Souza Nogueira-Lima; Tárcio Peixoto Roca; Felipe Gomes Naveca; Alcione de Oliveria dos Santos; Adriana Cristina Salvador Maia; Cicileia Correia da Silva; Aline Linhares Ferreira de Melo Mendonça; Celina Aparecida Bertoni Lugtenburg; Camila Flávia Gomes Azzi; Juliana Loca Furtado Fontes; Suelen Cavalcante; Rita de Cássia Pontello Rampazzo; Caio Henrique Nemeth Santos; Alice Paula Di Sabatino Guimarães; Fernando Rodrigues Máximo; Juan Miguel Villalobos-Salcedo and Deusilene Souza Vieira