dragonn.synthetic.synthetic module¶
-
class
dragonn.synthetic.synthetic.
AbstractApplySingleMutationFromSet
(setOfMutations, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractTransformation
Class for applying a single mutation from a set of mutations; used to transform substrings generated by another method
-
class
dragonn.synthetic.synthetic.
AbstractBackgroundGenerator
[source]¶ Bases:
object
Returns the sequence that the embeddings are subsequently inserted into.
-
class
dragonn.synthetic.synthetic.
AbstractEmbeddable
[source]¶ Bases:
object
Represents a thing which can be embedded. Note that an Embeddable + a position = an embedding.
-
canEmbed
(priorEmbeddedThings, startPos)[source]¶
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
startPos: the position you are considering embedding self at
Returns a boolean indicating whether self can be embedded at startPos, given the things that have already been embedded.
-
embedInBackgroundStringArr
(priorEmbeddedThings, backgroundStringArr, startPos)[source]¶
Will embed self at startPos in backgroundStringArr, and will update priorEmbeddedThings.
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
backgroundStringArr: an array of characters representing the background
startPos: the position to embed self at
-
-
class
dragonn.synthetic.synthetic.
AbstractEmbeddableGenerator
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
Generates an embeddable, usually for embedding in a background sequence.
-
class
dragonn.synthetic.synthetic.
AbstractEmbedder
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
class that is used to embed things in a sequence
-
embed
(backgroundStringArr, priorEmbeddedThings, additionalInfo=None)[source]¶
backgroundStringArr: array of characters representing the background string
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
additionalInfo: instance of AdditionalInfo; allows the embedder to send back info about what it did
Modifies: backgroundStringArr to include whatever this class has embedded
-
-
class
dragonn.synthetic.synthetic.
AbstractLoadedMotifs
(fileName, pseudocountProb=0.0, background=OrderedDict([('A', 0.27), ('C', 0.23), ('G', 0.23), ('T', 0.27)]))[source]¶ Bases:
object
A class that contains instances of pwm.PWM loaded from a file. The pwms can be accessed by name.
-
class
dragonn.synthetic.synthetic.
AbstractPositionGenerator
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
Given the length of the background sequence and the length of the substring you are trying to embed, will return a start position to embed the substring at.
-
class
dragonn.synthetic.synthetic.
AbstractPriorEmbeddedThings
[source]¶ Bases:
object
class that is used to keep track of what has already been embedded in a sequence
-
addEmbedding
(startPos, what)[source]¶ embeds “what” from startPos to startPos+len(what). Creates an Embedding object
-
canEmbed
(startPos, endPos)[source]¶ returns a boolean indicating whether the region from startPos to endPos is available for embedding
-
-
class
dragonn.synthetic.synthetic.
AbstractQuantityGenerator
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
class to sample according to a distribution
-
class
dragonn.synthetic.synthetic.
AbstractSequenceSetGenerator
[source]¶ Bases:
object
class that is used to return a generator for a collection of generated sequences.
-
class
dragonn.synthetic.synthetic.
AbstractSetOfMutations
(mutationsArr)[source]¶ Bases:
object
Represents a collection of pwm.Mutation objects
-
class
dragonn.synthetic.synthetic.
AbstractSingleSequenceGenerator
(namePrefix=None)[source]¶ Bases:
object
When called, generates a single sequence
-
class
dragonn.synthetic.synthetic.
AbstractSubstringGenerator
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
Generates a substring, usually for embedding in a background sequence.
-
class
dragonn.synthetic.synthetic.
AbstractTransformation
(name)[source]¶ Bases:
dragonn.synthetic.synthetic.DefaultNameMixin
takes an array of characters, applies some transformation, returns an array of characters (may be the same (mutated) one or a different one)
-
class
dragonn.synthetic.synthetic.
AllEmbedders
(embedders, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbedder
Wrapper around a list of embedders to make sure all are called. Useful in conjunction with RandomSubsetOfEmbedders.
-
class
dragonn.synthetic.synthetic.
BernoulliQuantityGenerator
(prob, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Generates 1 or 0 according to a Bernoulli distribution
-
class
dragonn.synthetic.synthetic.
BestHitPwm
(pwm, bestHitMode='pwmProb', name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
always returns the best possible match to the pwm in question when called
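The idea of a best hit can be sketched without the library. The snippet below assumes a PWM is a list of per-position dictionaries of base probabilities and that the best match takes the most probable base at each position. Both are assumptions made for illustration: the pwm.PWM representation and the exact meaning of each bestHitMode are not described in this reference, and `best_hit` is a hypothetical name.

```python
def best_hit(pwm_rows):
    """Return the string made of the most probable base at each position.

    `pwm_rows` is an assumed representation: one dict per motif position,
    mapping each base to its probability.
    """
    return "".join(max(row, key=row.get) for row in pwm_rows)


pwm_rows = [
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.8, "C": 0.1, "G": 0.05, "T": 0.05},
    {"A": 0.2, "C": 0.1, "G": 0.1, "T": 0.6},
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
]
print(best_hit(pwm_rows))  # GATA
```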
-
class
dragonn.synthetic.synthetic.
BestHitPwmFromLoadedMotifs
(loadedMotifs, motifName, bestHitMode='pwmProb', name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.BestHitPwm
convenience wrapper class for instantiating parent by pulling the pwm given the name from an AbstractLoadedMotifs object (it basically extracts the pwm for you)
-
class
dragonn.synthetic.synthetic.
ChooseMutationAtRandom
(setOfMutations, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractApplySingleMutationFromSet
Selects a mutation at random from self.setOfMutations to apply; see parent docs.
-
class
dragonn.synthetic.synthetic.
ChooseValueFromASet
(setOfPossibleValues, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Randomly samples a particular value from a set of values
-
class
dragonn.synthetic.synthetic.
EmbedInABackground
(backgroundGenerator, embedders, namePrefix=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSingleSequenceGenerator
Takes a backgroundGenerator and a series of embedders. Will generate the background and then call each of the embedders in succession. Then returns the result.
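The generate-then-embed flow described above can be sketched in plain Python. This is a minimal illustration and not the library's code: embedders are modelled as callables that mutate a character array, a plain list stands in for priorEmbeddedThings, and `embed_in_background` and `fixed_embedder` are hypothetical names.

```python
def embed_in_background(background_generator, embedders):
    """Generate a background, run each embedder in order, return the result."""
    background_arr = list(background_generator())
    prior_embedded = []  # stand-in for an AbstractPriorEmbeddedThings instance
    for embedder in embedders:
        embedder(background_arr, prior_embedded)
    return "".join(background_arr), prior_embedded


def fixed_embedder(string, start_pos):
    """Build a toy embedder that always writes `string` at `start_pos`."""
    def embed(background_arr, prior_embedded):
        background_arr[start_pos:start_pos + len(string)] = list(string)
        prior_embedded.append((start_pos, string))
    return embed


seq, embeddings = embed_in_background(
    lambda: "N" * 12,
    [fixed_embedder("GATA", 1), fixed_embedder("TTT", 7)],
)
print(seq)         # NGATANNTTTNN
print(embeddings)  # [(1, 'GATA'), (7, 'TTT')]
```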
-
class
dragonn.synthetic.synthetic.
EmbeddableEmbedder
(embeddableGenerator, positionGenerator=<dragonn.synthetic.synthetic.UniformPositionGenerator object>, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbedder
Embeds instances of AbstractEmbeddable within the background sequence, at a position sampled from a distribution. Only embeds at unoccupied positions
-
class
dragonn.synthetic.synthetic.
Embedding
(what, startPos)[source]¶ Bases:
object
Represents something that has been embedded in a sequence
-
class
dragonn.synthetic.synthetic.
FixedQuantityGenerator
(quantity, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
returns a fixed number every time generateQuantity is called
-
class
dragonn.synthetic.synthetic.
FixedSubstringGenerator
(fixedSubstring, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
When generateSubstring() is called, always returns the same string. The string also serves as its own description
-
class
dragonn.synthetic.synthetic.
GenerateSequenceNTimes
(singleSetGenerator, N)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSequenceSetGenerator
If you just want to use a generator of a single sequence and call it N times, use this class.
-
class
dragonn.synthetic.synthetic.
GeneratedSequence
(seqName, seq, embeddings, additionalInfo)[source]¶ Bases:
object
An object representing a sequence that has been generated.
-
class
dragonn.synthetic.synthetic.
InsideCentralBp
(centralBp, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractPositionGenerator
returns a position within the central region of a background sequence, sampled uniformly at random
-
class
dragonn.synthetic.synthetic.
LabelGenerator
(labelNames, labelsFromGeneratedSequenceFunction)[source]¶ Bases:
object
-
class
dragonn.synthetic.synthetic.
LoadedEncodeMotifs
(fileName, pseudocountProb=0.0, background=OrderedDict([('A', 0.27), ('C', 0.23), ('G', 0.23), ('T', 0.27)]))[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractLoadedMotifs
This class is for reading files in the ENCODE motif format, specifically the motifs.txt file that contains Pouya’s motifs
-
class
dragonn.synthetic.synthetic.
MinMaxWrapper
(quantityGenerator, theMin=None, theMax=None, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Wrapper that restricts a distribution to only return values between the min and the max. If a value outside the range is returned, resamples until it obtains a value within the range. Warns if it resamples too many times.
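The resampling behaviour can be sketched as a small rejection loop. The snippet assumes the bounds are inclusive and uses a made-up warning threshold (`warn_after`); neither detail is stated in this reference, and `sample_within_range` is a hypothetical name, not part of dragonn.

```python
import random
import warnings


def sample_within_range(sampler, the_min=None, the_max=None, warn_after=10):
    """Resample from `sampler` until the value lies in [the_min, the_max].

    A bound of None means that side is unbounded. `warn_after` is an
    assumed threshold chosen for illustration.
    """
    tries = 0
    while True:
        value = sampler()
        tries += 1
        above_min = the_min is None or value >= the_min
        below_max = the_max is None or value <= the_max
        if above_min and below_max:
            return value
        if tries % warn_after == 0:
            warnings.warn("resampled %d times without a value in range" % tries)


rng = random.Random(0)
draws = [
    sample_within_range(lambda: rng.randint(0, 9), the_min=2, the_max=5)
    for _ in range(100)
]
print(min(draws), max(draws))
```

Rejection sampling keeps the shape of the wrapped distribution inside the allowed range, at the cost of extra draws when the range covers little of the distribution's mass.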
-
class
dragonn.synthetic.synthetic.
OutsideCentralBp
(centralBp, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractPositionGenerator
Returns a position OUTSIDE the central region of a background sequence, sampled uniformly at random. Complement of InsideCentralBp.
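One way to picture InsideCentralBp and OutsideCentralBp is sketched below. It assumes the central region is the middle centralBp bases of the background and that the embedded substring must lie entirely inside (or entirely outside) that region. These are assumptions about the semantics, and the function names are hypothetical.

```python
import random


def central_bounds(background_len, central_bp):
    """Return (start, end) of the assumed central region, end exclusive."""
    start = (background_len - central_bp) // 2
    return start, start + central_bp


def inside_central(background_len, substring_len, central_bp, rng=random):
    start, end = central_bounds(background_len, central_bp)
    if substring_len > central_bp or central_bp > background_len:
        raise ValueError("substring does not fit in the central region")
    return rng.randint(start, end - substring_len)


def outside_central(background_len, substring_len, central_bp, rng=random):
    start, end = central_bounds(background_len, central_bp)
    # Valid starts keep the substring fully left or fully right of the region.
    left = list(range(0, start - substring_len + 1))
    right = list(range(end, background_len - substring_len + 1))
    candidates = left + right
    if not candidates:
        raise ValueError("no position outside the central region fits")
    return rng.choice(candidates)


rng = random.Random(0)
inside = [inside_central(100, 6, 20, rng) for _ in range(200)]
outside = [outside_central(100, 6, 20, rng) for _ in range(200)]
print(min(inside), max(inside))
```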
-
class
dragonn.synthetic.synthetic.
PairEmbeddable
(string1, string2, separation, embeddableDescription, nothingInBetween=True)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddable
Represents a pair of strings that are embedded with some separation. Used for motif grammars. See superclass docs.
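A pair embedding can be sketched as two writes into the background array. The snippet assumes that separation counts the background positions between the end of the first string and the start of the second; that interpretation, and the name `embed_pair`, are illustrative assumptions and not taken from the library.

```python
def embed_pair(background_arr, string1, string2, separation, start_pos):
    """Write string1 at start_pos and string2 `separation` positions after it.

    Modifies `background_arr` in place and returns the start of string2.
    """
    second_start = start_pos + len(string1) + separation
    end_pos = second_start + len(string2)
    if start_pos < 0 or end_pos > len(background_arr):
        raise ValueError("pair does not fit in the background")
    background_arr[start_pos:start_pos + len(string1)] = list(string1)
    background_arr[second_start:end_pos] = list(string2)
    return second_start


background = list("N" * 16)
second_start = embed_pair(background, "GATA", "CACGTG", separation=2, start_pos=1)
print("".join(background))  # NGATANNCACGTGNNN
print(second_start)         # 7
```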
-
class
dragonn.synthetic.synthetic.
PairEmbeddableGenerator
(substringGenerator1, substringGenerator2, separationGenerator, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class
dragonn.synthetic.synthetic.
PairEmbeddableGenerator_General
(embeddableGenerator1, embeddableGenerator2, separationGenerator, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class
dragonn.synthetic.synthetic.
PairEmbeddable_General
(embeddable1, embeddable2, separation, embeddableDescription, nothingInBetween=True)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddable
Embeds two Embeddable objects with some separation between them
-
class
dragonn.synthetic.synthetic.
PoissonQuantityGenerator
(mean, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Generates values according to a Poisson distribution
-
class
dragonn.synthetic.synthetic.
PriorEmbeddedThings_numpyArrayBacked
(seqLen)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractPriorEmbeddedThings
Uses a numpy array, in which positions are set to 1 if they are occupied, to determine which positions are occupied and which are not. See parent for more documentation.
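The occupancy bookkeeping can be illustrated with a small stand-in class. To stay dependency-free, this sketch uses a plain Python list where the library uses a numpy array, and it assumes endPos is exclusive. `OccupancyTracker` is a hypothetical name, not a dragonn class.

```python
class OccupancyTracker:
    """Toy stand-in for an AbstractPriorEmbeddedThings subclass."""

    def __init__(self, seq_len):
        self.occupied = [0] * seq_len  # 1 marks an occupied position
        self.embeddings = []           # (start_pos, what) pairs

    def can_embed(self, start_pos, end_pos):
        # Assumes end_pos is exclusive.
        if start_pos < 0 or end_pos > len(self.occupied):
            return False
        return sum(self.occupied[start_pos:end_pos]) == 0

    def add_embedding(self, start_pos, what):
        for i in range(start_pos, start_pos + len(what)):
            self.occupied[i] = 1
        self.embeddings.append((start_pos, what))


tracker = OccupancyTracker(20)
tracker.add_embedding(5, "GATAAG")  # occupies positions 5 through 10
print(tracker.can_embed(0, 5))      # True
print(tracker.can_embed(8, 12))     # False
```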
-
class
dragonn.synthetic.synthetic.
PwmSampler
(pwm, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
samples from the pwm by calling self.pwm.sampleFromPwm
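Sampling from a PWM amounts to drawing one base per position from that position's distribution. The sketch below assumes a list-of-dicts PWM representation and does not call pwm.PWM.sampleFromPwm, whose internals are not shown in this reference. `sample_from_pwm` is a hypothetical name.

```python
import random


def sample_from_pwm(pwm_rows, rng=random):
    """Draw one base per position according to that position's probabilities."""
    sampled = []
    for row in pwm_rows:
        letters = list(row)
        weights = [row[letter] for letter in letters]
        sampled.append(rng.choices(letters, weights=weights, k=1)[0])
    return "".join(sampled)


pwm_rows = [
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},  # always G
    {"A": 0.5, "C": 0.0, "G": 0.0, "T": 0.5},  # A or T with equal probability
]
sample = sample_from_pwm(pwm_rows, random.Random(0))
print(sample)
```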
-
class
dragonn.synthetic.synthetic.
PwmSamplerFromLoadedMotifs
(loadedMotifs, motifName, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.PwmSampler
convenience wrapper class for instantiating parent by pulling the pwm given the name from an AbstractLoadedMotifs object (it basically extracts the pwm for you)
-
class
dragonn.synthetic.synthetic.
RandomSubsetOfEmbedders
(quantityGenerator, embedders, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbedder
Takes a quantity generator that generates a quantity of embedders, and executes that many embedders from a supplied set, in sequence
-
class
dragonn.synthetic.synthetic.
RepeatedEmbedder
(embedder, quantityGenerator, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbedder
Wrapper around an embedder to call it multiple times according to sampling from a distribution.
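The wrapper pattern can be sketched as sampling a count and then calling the wrapped embedder that many times. The callables below are simplified stand-ins that take no arguments, and `repeat_embedder` is a hypothetical name.

```python
def repeat_embedder(embedder, quantity_generator):
    """Sample a count from `quantity_generator`, then call `embedder` that often."""
    quantity = quantity_generator()
    for _ in range(quantity):
        embedder()
    return quantity


calls = []
quantity = repeat_embedder(lambda: calls.append("motif"), lambda: 3)
print(quantity, calls)  # 3 ['motif', 'motif', 'motif']
```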
-
class
dragonn.synthetic.synthetic.
RepeatedSubstringBackgroundGenerator
(substringGenerator, repetitions)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractBackgroundGenerator
-
class
dragonn.synthetic.synthetic.
ReverseComplementWrapper
(substringGenerator, reverseComplementProb=0.5, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
Wrapper around an AbstractSubstringGenerator that reverse complements its output with the specified probability.
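A minimal sketch of the behaviour, written without the library: reverse the string, complement each base, and apply that transformation with a given probability. The helper names are hypothetical and only the four DNA bases are handled.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def maybe_reverse_complement(seq, reverse_complement_prob=0.5, rng=random):
    """Reverse complement `seq` with the given probability, else return it as is."""
    if rng.random() < reverse_complement_prob:
        return reverse_complement(seq)
    return seq


print(reverse_complement("GATAAG"))             # CTTATC
print(maybe_reverse_complement("GATAAG", 0.0))  # GATAAG
print(maybe_reverse_complement("GATAAG", 1.0))  # CTTATC
```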
-
class
dragonn.synthetic.synthetic.
RevertToReference
(setOfMutations, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractTransformation
for a series of mutations, reverts the supplied string to the reference (“unmutated”) string
-
class
dragonn.synthetic.synthetic.
SampleFromDiscreteDistributionSubstringGenerator
(discreteDistribution)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
-
class
dragonn.synthetic.synthetic.
StringEmbeddable
(string, stringDescription='')[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddable
represents a string (such as a sampling from a pwm) that is to be embedded in a background. See docs for superclass.
-
class
dragonn.synthetic.synthetic.
SubstringEmbeddableGenerator
(substringGenerator, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class
dragonn.synthetic.synthetic.
SubstringEmbedder
(substringGenerator, positionGenerator=<dragonn.synthetic.synthetic.UniformPositionGenerator object>, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.EmbeddableEmbedder
embeds a single generated substring within the background sequence, at a position sampled from a distribution. Only embeds at unoccupied positions
-
class
dragonn.synthetic.synthetic.
TopNMutationsFromPwmRelativeToBestHit
(pwm, N, bestHitMode)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSetOfMutations
See docs for parent; here, the collection of mutations are the top N strongest mutations for a PWM as compared to the best match for that pwm.
-
class
dragonn.synthetic.synthetic.
TopNMutationsFromPwmRelativeToBestHit_FromLoadedMotifs
(loadedMotifs, pwmName, N, bestHitMode)[source]¶ Bases:
dragonn.synthetic.synthetic.TopNMutationsFromPwmRelativeToBestHit
Like parent, except extracts the pwm.PWM object from an AbstractLoadedMotifs object, saving you a few lines of code.
-
class
dragonn.synthetic.synthetic.
TransformedSubstringGenerator
(substringGenerator, transformations, transformationsDescription='transformations', name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractSubstringGenerator
Takes a substringGenerator and a set of AbstractTransformation objects, applies the transformations to the generated substring
-
class
dragonn.synthetic.synthetic.
UniformIntegerGenerator
(minVal, maxVal, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Randomly samples an integer from minVal to maxVal, inclusive.
-
class
dragonn.synthetic.synthetic.
UniformPositionGenerator
(name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractPositionGenerator
samples a start position to embed the substring in uniformly at random; does not return positions that are too close to the end of the background sequence to embed the full substring.
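The constraint described above means the valid start positions run from 0 to the background length minus the substring length, inclusive. A minimal sketch, with `uniform_start` as a hypothetical name:

```python
import random


def uniform_start(background_len, substring_len, rng=random):
    """Sample a start position such that the whole substring fits."""
    if substring_len > background_len:
        raise ValueError("substring is longer than the background")
    return rng.randint(0, background_len - substring_len)


rng = random.Random(0)
starts = [uniform_start(100, 6, rng) for _ in range(200)]
print(min(starts), max(starts))
```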
-
class
dragonn.synthetic.synthetic.
XOREmbedder
(embedder1, embedder2, probOfFirst, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractEmbedder
calls exactly one of the supplied embedders
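The either-or choice can be sketched as a single weighted coin flip. The sketch assumes probOfFirst is the probability of calling the first embedder, uses argument-free callables as stand-ins for embedders, and `call_exactly_one` is a hypothetical name.

```python
import random


def call_exactly_one(embedder1, embedder2, prob_of_first, rng=random):
    """Call embedder1 with probability prob_of_first, otherwise embedder2."""
    if rng.random() < prob_of_first:
        return embedder1()
    return embedder2()


calls = []
call_exactly_one(
    lambda: calls.append("first"),
    lambda: calls.append("second"),
    prob_of_first=1.0,
)
print(calls)  # ['first']
```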
-
class
dragonn.synthetic.synthetic.
ZeroInflater
(quantityGenerator, zeroProb, name=None)[source]¶ Bases:
dragonn.synthetic.synthetic.AbstractQuantityGenerator
Wrapper that inflates the number of zeros returned. Flips a coin that succeeds with probability zeroProb; on success, returns zero, otherwise samples from the wrapped distribution (which may still return 0).
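Zero inflation can be sketched in a few lines. The snippet assumes zeroProb is the probability of short-circuiting to zero, and `zero_inflated` is a hypothetical name that does not come from the library.

```python
import random


def zero_inflated(sampler, zero_prob, rng=random):
    """Return 0 with probability zero_prob, otherwise a draw from `sampler`."""
    if rng.random() < zero_prob:
        return 0
    return sampler()


print(zero_inflated(lambda: 4, zero_prob=1.0))  # 0
print(zero_inflated(lambda: 4, zero_prob=0.0))  # 4
```

Because the wrapped sampler can itself return 0, the overall probability of a zero is zero_prob plus (1 - zero_prob) times the sampler's own zero probability.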
-
class
dragonn.synthetic.synthetic.
ZeroOrderBackgroundGenerator
(seqLength, discreteDistribution=<dragonn.synthetic.util.DiscreteDistribution object>)[source]¶ Bases:
dragonn.synthetic.synthetic.RepeatedSubstringBackgroundGenerator
returns a sequence with 40% GC content. Each base is sampled independently.
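A zero-order background can be sketched by drawing every base independently from one fixed distribution. The probabilities below (A and T at 0.3, C and G at 0.2) are an assumed distribution chosen to give 40% GC content; the library's default util.DiscreteDistribution values are not listed in this reference. `zero_order_string` is a hypothetical name.

```python
import random

# Assumed base frequencies that yield 40% GC content.
FORTY_PERCENT_GC = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}


def zero_order_string(length, distribution=FORTY_PERCENT_GC, rng=random):
    """Sample `length` bases independently from `distribution`."""
    letters = list(distribution)
    weights = [distribution[letter] for letter in letters]
    return "".join(rng.choices(letters, weights=weights, k=length))


background = zero_order_string(10000, rng=random.Random(0))
gc_fraction = (background.count("G") + background.count("C")) / len(background)
print(round(gc_fraction, 2))
```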
-
dragonn.synthetic.synthetic.
generateString_zeroOrderMarkov
(length, discreteDistribution=<dragonn.synthetic.util.DiscreteDistribution object>)[source]¶ discreteDistribution: instance of util.DiscreteDistribution
-
dragonn.synthetic.synthetic.
printSequences
(outputFileName, sequenceSetGenerator, includeEmbeddings=False, labelGenerator=None, includeFasta=False)[source]¶
outputFileName: string
sequenceSetGenerator: instance of AbstractSequenceSetGenerator
Given an output filename and an instance of AbstractSequenceSetGenerator, will call the sequence set generator and print the generated sequences to the output file. Will also create a file “info_outputFileName.txt” in the same directory as outputFileName that contains all the information about sequenceSetGenerator.
includeEmbeddings: a boolean indicating whether to print a column that lists the embeddings
labelGenerator: instance of LabelGenerator
-
dragonn.synthetic.synthetic.
printSequencesTransformationPosNeg
(outputFileNamePos, outputFileNameNeg, sequenceSetGenerator, transformation)[source]¶
outputFileName: string
sequenceSetGenerator: instance of AbstractSequenceSetGenerator
generatedSequences: the sequences that have been generated by sequenceSetGenerator
Given an output filename and an instance of AbstractSequenceSetGenerator, will print the generated sequences to the output file. Will also create a file “info_outputFileName.txt” in the same directory as outputFileName that contains all the information about sequenceSetGenerator.