Getting proximal promoter sequences for a list of genes
with Perl scripts, for use in various analyses.

Note: On a Windows machine, you will need to install ActiveState ActivePerl. On a Unix/Linux/OS X machine, you may have to edit the first line of each script so that it contains the path to your Perl interpreter, if Perl is not already in your path. These scripts are known to work on Windows XP, but have not been tested extensively in other environments.

Use these scripts to get the proximal promoter sequences for a list of genes, given one per line in a text file. We define the proximal promoter sequence of a gene as the 4500bp spanning the transcription start site (TSS), from -4300bp to +200bp.

The promoter retrieval tool consists of five individual scripts, each of which runs and then either quits (if there is an error) or prompts the user to hit return to automatically begin the next script in the series. The process begins when the user makes the call

	> promoter.pl sample.txt

which will produce the file mapurls.out. If there is no error (and there should be no error with the above sample file), then the tool will prompt the user to run the next script. The list of scripts and their output is as follows:

	the call	on the file		outputs

	promoter.pl	sample.txt		mapurls.out
	prom2.pl	mapurls.out		contigurls.out
	prom3.pl	contigurls.out		contiginfo.out
	prom4.pl	contiginfo.out		promoterinfo.out
	prom5.pl	promoterinfo.out	promoterlist.out and
						 one file per gene
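
If a run stops partway through the series, it can help to see which output files already exist. The following is a minimal sketch, not part of the distributed scripts: the helper name first_incomplete_stage is hypothetical, and it assumes that each output file is written to the directory you pass in and is non-empty when its script succeeds.

```perl
use strict;
use warnings;

# Script => output file, in run order, as listed in the table above.
my @stages = (
    [ 'promoter.pl', 'mapurls.out'      ],
    [ 'prom2.pl',    'contigurls.out'   ],
    [ 'prom3.pl',    'contiginfo.out'   ],
    [ 'prom4.pl',    'promoterinfo.out' ],
    [ 'prom5.pl',    'promoterlist.out' ],
);

# Hypothetical helper: returns the name of the first script whose output
# file is missing or empty in $dir, or undef if all five outputs exist.
sub first_incomplete_stage {
    my ($dir) = @_;
    for my $stage (@stages) {
        my ($script, $output) = @$stage;
        return $script unless -s "$dir/$output";
    }
    return;
}

my $stalled = first_incomplete_stage('.');
print defined $stalled
    ? "Next script to run (or the one that failed): $stalled\n"
    : "All five output files are present\n";
```

Run it from the directory that holds the .out files. It only inspects file sizes, so it does not tell you whether the contents of a file are correct.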

If you want a larger or smaller promoter sequence relative to the transcription start site (rather than the standard -4300bp to +200bp), you can change the following variables in the file prom5.pl:

	$RANGE_BACK = 4300;
	$RANGE_FRONT = 199;

If you want to get data not based on the TSS, however, you'll have to do some programming of your own, or contact us.
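
To see how these two values relate to the 4500bp window, here is a small sketch of the arithmetic. It is an illustration only and is not taken from prom5.pl: the helper promoter_window is hypothetical, and it assumes that the TSS base itself counts as +1 (which would explain why 199, rather than 200, gives an end point of +200) and that coordinates are 1-based and inclusive. The minus-strand handling is likewise an assumption.

```perl
use strict;
use warnings;

# Same values as the defaults in prom5.pl.
my $RANGE_BACK  = 4300;   # bases upstream of the TSS
my $RANGE_FRONT = 199;    # bases downstream of the TSS base itself

# Hypothetical helper: returns (start, end) of the window on the contig,
# 1-based inclusive, for a TSS at position $tss on strand '+' or '-'.
sub promoter_window {
    my ($tss, $strand) = @_;
    if ($strand eq '-') {
        # On the minus strand, upstream lies at higher coordinates.
        return ($tss - $RANGE_FRONT, $tss + $RANGE_BACK);
    }
    return ($tss - $RANGE_BACK, $tss + $RANGE_FRONT);
}

my ($start, $end) = promoter_window(10_000, '+');
printf "%d-%d (%d bp)\n", $start, $end, $end - $start + 1;
# prints "5700-10199 (4500 bp)"
```

Under these assumptions the window length is always $RANGE_BACK + $RANGE_FRONT + 1, so raising $RANGE_BACK to 5000, for example, would give a 5200bp window.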

Note: An error in a run of prom2.pl usually means the gene has been renamed from what you were searching for. Look for "rename" in the mapurls.out file. (You will see this if you search for HSSG1, which has been renamed SFRS7.)
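
The search for "rename" can be done in any text editor, or with a few lines of Perl. The sketch below is not one of the distributed scripts. The helper find_renames is hypothetical, and because the exact layout of mapurls.out is not described here, it simply reports every line that contains the word, ignoring case.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every line of $path that mentions "rename"
# (case-insensitive), each prefixed with its line number.
sub find_renames {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        push @hits, "$.: $line" if $line =~ /rename/i;
    }
    close $fh;
    return @hits;
}

# Default to mapurls.out in the current directory if no file is given.
my $file = @ARGV ? $ARGV[0] : 'mapurls.out';
if (-e $file) {
    print "$_\n" for find_renames($file);
} else {
    warn "$file not found; run promoter.pl first\n";
}
```

Any lines it prints should point you to the current gene name, which you can then put in your input file in place of the old one.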

To get the Perl scripts, you can either download the entire set as a gzipped tar archive or right-click (ctrl-click) on any of the links above and save the file to your local disk drive. Save the file with the ".pl" extension (e.g., prom2 as prom2.pl). You may also need to set the executable bit for the file.


Please send bug reports to