Which VCF to use for germline calls (neuroblastoma)

Hello!
I want to make sure I’m using the portal right. I basically want to get all the relevant processed data for a given patient (e.g. PT_AJ04JZ0D). I’m not interested in improving upon the variant calls, so I just want the most processed/filtered results. I see there are a whole bunch of files listed as belonging to that patient’s tumor sample:

0ddebe86-3b95-4ec4-887a-4044246851a0.CGP.filtered.deNovo.vep.vcf.gz

44fe19a0-9c60-4c68-b43a-198f4cb1e09d.mutect2_somatic.PASS.vep.vcf.gz

44fe19a0-9c60-4c68-b43a-198f4cb1e09d.strelka2_somatic.PASS.vep.vcf.gz

6908505d-48cb-4fe4-9464-40bcfa9e75a5.vardict_somatic.PASS.vep.vcf.gz

7cac2f31-8f45-407f-8bac-aca793fb8af3.lancet_somatic.PASS.vep.vcf.gz

8bc65c6a-3878-4f44-84ee-440e56fdb940.consensus_somatic.PASS.vep.vcf.gz

SL266438.hard-filtered.vcf.gz

SL332889.hard-filtered.vcf.gz

Where can I get info about what these are and which to use? I’m guessing that the one with consensus_somatic in its name is the one to use (and/or the corresponding maf).

And for the germline, I should just use the one called SL267368.hard-filtered.vcf.gz (annotated as “normal” tissue type)? Or should I use the gVCF file? I’d prefer not to have to download the gVCF files since they are so big.

Thank you!
Rachel

Hi Rachel,
Thanks for your question! I wouldn’t use the SL-prefixed hard-filtered files unless you specifically want the calls that came from the sequencing center. They may not have been made against the same reference genome that we harmonize to, so I’d advise against it. The bad news is that, for the moment, you are stuck with the gVCF. The good news is the following:

  1. We have a workflow that is suitable for creating single-sample calls fitting your criteria. We anticipate a public release within the next couple of days.
  2. Better news, if you’re patient: within about a month’s time we anticipate running that workflow, or an equivalent, and registering those files in the portal so that this issue is a thing of the past!

I will post an update when the new workflow is available, and I anticipate that other KF members will announce the arrival of these new files.
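
On the reference genome point above: if you do end up with one of the SL-prefixed files in hand, a rough way to see what it was called against is to read the VCF header. The sketch below is only an illustration, not part of any KF workflow. The function name `vcf_reference_hints` is made up for this example, and it assumes the file carries `##reference` and/or `##contig` header lines, which not every VCF does.

```python
import gzip

def vcf_reference_hints(path):
    """Collect reference-related header lines from a gzipped/bgzipped VCF.

    Hypothetical helper for illustration only.
    """
    hints = {"reference": None, "contigs": []}
    with gzip.open(path, "rt") as handle:
        for line in handle:
            if not line.startswith("##"):
                break  # meta-information lines are over; stop before the records
            line = line.strip()
            if line.startswith("##reference="):
                hints["reference"] = line.split("=", 1)[1]
            elif line.startswith("##contig=<") and line.endswith(">"):
                # e.g. ##contig=<ID=chr1,length=248956422>
                # Naive split on commas; good enough to pull out the ID field.
                for field in line[len("##contig=<"):-1].split(","):
                    if field.startswith("ID="):
                        hints["contigs"].append(field[3:])
                        break
    return hints

# Example call (the path is whatever you downloaded locally):
# print(vcf_reference_hints("SL267368.hard-filtered.vcf.gz"))
```

Comparing the `reference` value and the contig naming (for instance `chr1` versus `1`) against a harmonized file gives a quick hint as to whether the two were called against the same build. It is a sanity check only, not proof that the references match.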

Thanks,
Miguel


Just a quick update on this. As promised, the first version of our GATK single-sample genotyping and variant calling workflow can be found here. No updates yet on when KFDRC-processed versions of these files will be available.
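
In the meantime, for anyone sorting through a file list like the one in the original question, here is a minimal sketch that groups the files by their name pattern. The patterns are taken only from the file names quoted in this thread. The category labels and the `categorize` function are made up for illustration and are not an official KF naming scheme.

```python
import re

# Patterns inferred from the file names quoted in the question.
# The labels on the right are illustrative, not official terminology.
PATTERNS = [
    (re.compile(r"\.consensus_somatic\."), "somatic consensus"),
    (re.compile(r"\.(mutect2|strelka2|vardict|lancet)_somatic\."), "somatic single caller"),
    (re.compile(r"\.deNovo\."), "de novo"),
    (re.compile(r"^SL\d+\.hard-filtered\."), "sequencing-center hard-filtered"),
]

def categorize(filename):
    """Return an illustrative category label for a portal file name."""
    for pattern, label in PATTERNS:
        if pattern.search(filename):
            return label
    return "unrecognized"

names = [
    "0ddebe86-3b95-4ec4-887a-4044246851a0.CGP.filtered.deNovo.vep.vcf.gz",
    "44fe19a0-9c60-4c68-b43a-198f4cb1e09d.mutect2_somatic.PASS.vep.vcf.gz",
    "8bc65c6a-3878-4f44-84ee-440e56fdb940.consensus_somatic.PASS.vep.vcf.gz",
    "SL266438.hard-filtered.vcf.gz",
]
for name in names:
    print(f"{categorize(name):32s} {name}")
```

This only sorts names. It does not say which file is right for a given analysis, so the advice above about the SL-prefixed files still applies.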
