dc.description.abstract |
Rheumatoid arthritis is an autoimmune disorder of complex disease etiology. The
serological and genetic markers of rheumatoid arthritis have not been completely
deciphered. Currently available serological diagnostic markers lack sufficient sensitivity
and specificity, and thus additional biomarkers are warranted for early disease diagnosis
and management. Genome-wide association studies have enabled simultaneous
identification of single nucleotide polymorphisms associated with complex disorders such
as rheumatoid arthritis. However, polymorphisms that fail to reach the
significance threshold can be missed. Data integration can help identify these
variants.
The current study aimed to screen and compare the serum proteome profiles of
rheumatoid arthritis serotypes with healthy controls in the Pakistani population for
identification of potential disease biomarkers. It further aimed to identify
novel candidate non-coding risk variants for rheumatoid arthritis using a data integration
pipeline.
Serum samples were collected from Pakistani rheumatoid arthritis patients and
healthy controls. The samples were enriched for low-abundance proteins using
ProteoMiner™ columns. Patients were assigned to one of four serotypes based on
anti-citrullinated peptide antibodies and rheumatoid factor. Serum protein profiles were
analyzed via liquid chromatography-tandem mass spectrometry (LC-MS/MS).
Changes in protein abundance were determined using the label-free quantification
software ProgenesisQI™. Ingenuity Pathway Analysis was used to analyze the pathways
associated with the differentially expressed proteins. Findings were validated in an
independent cohort of patients and healthy controls using enzyme-linked immunosorbent
assay. A total of 340 single nucleotide polymorphisms significantly associated with
rheumatoid arthritis were chosen from published genome-wide association studies. The
SNiPA proxy search tool was used to identify single nucleotide polymorphisms linked to
the query polymorphisms, hereafter termed proxy single nucleotide polymorphisms, which
were then scored using RegulomeDB. Proxy single nucleotide polymorphisms with scores
less than three were annotated. Expression quantitative trait loci linked to these single
nucleotide polymorphisms were studied for protein-protein interactions using the STRING
database. Single nucleotide polymorphisms linked to key proteins were further annotated
using the SNPfunc tool.
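The score-based selection step described above can be illustrated with a minimal sketch. It assumes that each proxy polymorphism carries a RegulomeDB-style score consisting of a numeric category with an optional letter suffix (for example "2b"), and that "less than three" refers to the numeric category. The identifiers and scores below are hypothetical placeholders, not study data, and the function names are illustrative only.

```python
def regulome_category(score: str) -> int:
    """Return the numeric category of a RegulomeDB-style score such as '2b'.

    Assumption: the score is a number optionally followed by a letter suffix.
    """
    digits = "".join(ch for ch in score if ch.isdigit())
    if not digits:
        raise ValueError(f"no numeric category in score {score!r}")
    return int(digits)


def filter_proxy_snps(scored: dict, cutoff: int = 3) -> list:
    """Keep identifiers whose numeric score category is below the cutoff."""
    return [snp for snp, score in scored.items()
            if regulome_category(score) < cutoff]


# Hypothetical proxy polymorphisms with made-up scores (not study data).
example_scores = {
    "proxy_snp_1": "1f",
    "proxy_snp_2": "2b",
    "proxy_snp_3": "3a",
    "proxy_snp_4": "5",
}

kept = filter_proxy_snps(example_scores)
print(kept)
```

Only the polymorphisms retained by such a filter would proceed to annotation and to the expression quantitative trait locus lookup.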
A total of 213 proteins were identified. Comparative analysis of all groups (false
discovery rate less than 0.05, greater than 2-fold change, and identified with more than 2
unique peptides) identified ten proteins that were differentially expressed between
rheumatoid arthritis serotypes and healthy controls including pregnancy zone protein,
selenoprotein P, C4b-binding protein beta chain, apolipoprotein M, N-acetylmuramoyl-L-alanine amidase, catalytic chain, oncoprotein-induced transcript 3 protein,
carboxypeptidase N subunit 2, apolipoprotein C-I and apolipoprotein C-III. Pathway
analysis predicted inhibition of liver X receptor/ retinoid X receptor activation pathway
and production of nitric oxide and reactive oxygen species pathway in macrophages in all
serotypes. Protein interaction analysis identified 13 ‘hub proteins’ expressed by the
expression quantitative trait loci linked to 54 single nucleotide polymorphisms. Of these,
nine were already reported for rheumatoid arthritis. The remaining 45 polymorphisms,
mapped to 11 genomic loci, are novel candidate risk variants for rheumatoid arthritis. Of
9194 proxy single nucleotide polymorphisms, 492 single nucleotide polymorphisms
returned significant RegulomeDB scores and mapped to 94 expression quantitative trait
loci.
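The protein selection criteria reported above (false discovery rate below 0.05, more than 2-fold change, more than 2 unique peptides) can be expressed as a simple filter. The sketch below is an illustration under stated assumptions, not the ProgenesisQI™ workflow itself: fold change is assumed to be a positive patient-to-control ratio, and a change in either direction is assumed to count. The record type, function name, and protein entries are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    name: str
    fdr: float            # false discovery rate (adjusted p-value)
    fold_change: float    # patient/control abundance ratio, assumed > 0
    unique_peptides: int  # number of unique peptides identified


def is_differential(p: ProteinResult,
                    max_fdr: float = 0.05,
                    min_fold: float = 2.0,
                    min_peptides: int = 2) -> bool:
    """Apply the three thresholds stated in the abstract.

    Assumption: up- and down-regulation are treated symmetrically, so a
    ratio of 0.4 counts as a 2.5-fold change.
    """
    magnitude = max(p.fold_change, 1.0 / p.fold_change)
    return (p.fdr < max_fdr
            and magnitude > min_fold
            and p.unique_peptides > min_peptides)


# Hypothetical entries for illustration only (not study data).
candidates = [
    ProteinResult("hypothetical_A", 0.01, 2.5, 4),  # passes all criteria
    ProteinResult("hypothetical_B", 0.01, 0.4, 3),  # 2.5-fold decrease, passes
    ProteinResult("hypothetical_C", 0.20, 3.0, 5),  # fails the FDR criterion
    ProteinResult("hypothetical_D", 0.01, 3.0, 2),  # too few unique peptides
]

selected = [p.name for p in candidates if is_differential(p)]
print(selected)
```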
In conclusion, the current study has explored the untapped serum proteome of Pakistani
rheumatoid arthritis patients and identified a catalogue of serum biomarkers with diagnostic
and prognostic potential for Pakistani rheumatoid arthritis patients. These serum
biomarkers can be further tested in larger cohorts for evaluation of their diagnostic
potential. The study also used a data integration pipeline to identify the putative risk
variants for rheumatoid arthritis that might have been missed by genome-wide association
studies. These missed variants can help to fill the current gaps in the knowledge of
rheumatoid arthritis genetics. Further, the proposed data integration pipeline can be
incorporated into a ready-to-use computational package to accelerate the identification of
missed variants for other complex disorders. |
en_US |