dragonn.synthetic.synthetic module
-
class dragonn.synthetic.synthetic.AbstractApplySingleMutationFromSet(setOfMutations, name=None)
Bases: dragonn.synthetic.synthetic.AbstractTransformation
Class for applying a single mutation from a set of mutations; used to transform substrings generated by another method.
-
class dragonn.synthetic.synthetic.AbstractBackgroundGenerator
Bases: object
Returns the sequence that the embeddings are subsequently inserted into.
-
class dragonn.synthetic.synthetic.AbstractEmbeddable
Bases: object
Represents a thing which can be embedded. Note that an Embeddable + a position = an embedding.
-
canEmbed(priorEmbeddedThings, startPos)
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
startPos: the position you are considering embedding self at
Returns a boolean indicating whether self can be embedded at startPos, given the things that have already been embedded.
-
embedInBackgroundStringArr(priorEmbeddedThings, backgroundStringArr, startPos)
Will embed self at startPos in backgroundStringArr, and will update priorEmbeddedThings.
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
backgroundStringArr: an array of characters representing the background
startPos: the position to embed self at
-
-
class dragonn.synthetic.synthetic.AbstractEmbeddableGenerator(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Generates an embeddable, usually for embedding in a background sequence.
-
class dragonn.synthetic.synthetic.AbstractEmbedder(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Class that is used to embed things in a sequence.
-
embed(backgroundStringArr, priorEmbeddedThings, additionalInfo=None)
backgroundStringArr: array of characters representing the background string
priorEmbeddedThings: instance of AbstractPriorEmbeddedThings
additionalInfo: instance of AdditionalInfo; allows the embedder to send back info about what it did
Modifies: backgroundStringArr to include whatever this class has embedded
-
-
class dragonn.synthetic.synthetic.AbstractLoadedMotifs(fileName, pseudocountProb=0.0, background=OrderedDict([('A', 0.27), ('C', 0.23), ('G', 0.23), ('T', 0.27)]))
Bases: object
A class that contains instances of pwm.PWM loaded from a file. The pwms can be accessed by name.
-
class dragonn.synthetic.synthetic.AbstractPositionGenerator(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Given the length of the background sequence and the length of the substring you are trying to embed, will return a start position to embed the substring at.
-
class dragonn.synthetic.synthetic.AbstractPriorEmbeddedThings
Bases: object
Class that is used to keep track of what has already been embedded in a sequence.
-
addEmbedding(startPos, what)
Embeds “what” from startPos to startPos+len(what). Creates an Embedding object.
-
canEmbed(startPos, endPos)
Returns a boolean indicating whether the region from startPos to endPos is available for embedding.
-
-
class dragonn.synthetic.synthetic.AbstractQuantityGenerator(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Class to sample according to a distribution.
-
class dragonn.synthetic.synthetic.AbstractSequenceSetGenerator
Bases: object
Class that is used to return a generator for a collection of generated sequences.
-
class dragonn.synthetic.synthetic.AbstractSetOfMutations(mutationsArr)
Bases: object
Represents a collection of pwm.Mutation objects.
-
class dragonn.synthetic.synthetic.AbstractSingleSequenceGenerator(namePrefix=None)
Bases: object
When called, generates a single sequence.
-
class dragonn.synthetic.synthetic.AbstractSubstringGenerator(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Generates a substring, usually for embedding in a background sequence.
-
class dragonn.synthetic.synthetic.AbstractTransformation(name)
Bases: dragonn.synthetic.synthetic.DefaultNameMixin
Takes an array of characters, applies some transformation, and returns an array of characters (may be the same (mutated) one or a different one).
-
class dragonn.synthetic.synthetic.AllEmbedders(embedders, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbedder
Wrapper around a list of embedders to make sure all are called. Useful in conjunction with RandomSubsetOfEmbedders.
-
class dragonn.synthetic.synthetic.BernoulliQuantityGenerator(prob, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Generates 1 or 0 according to a Bernoulli distribution.
-
class dragonn.synthetic.synthetic.BestHitPwm(pwm, bestHitMode='pwmProb', name=None)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
Always returns the best possible match to the pwm in question when called.
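As a rough illustration of what a "best possible match" could mean under the default bestHitMode='pwmProb', the sketch below takes the most probable base at each position. The list-of-dicts PWM layout and the best_hit helper are assumptions made for this example; they are not the pwm.PWM interface.

```python
def best_hit(pwm_rows):
    """Return the string formed by the most probable base at each position.

    pwm_rows is a hypothetical layout: one dict per position, base -> probability.
    """
    return "".join(max(row, key=row.get) for row in pwm_rows)


# Toy 3-position PWM used only for illustration.
toy_pwm = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.2, "C": 0.2, "G": 0.1, "T": 0.5},
]
consensus = best_hit(toy_pwm)
```

Because the choice is deterministic, repeated calls on the same PWM yield the same string, which matches the "always returns" wording above.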
-
class dragonn.synthetic.synthetic.BestHitPwmFromLoadedMotifs(loadedMotifs, motifName, bestHitMode='pwmProb', name=None)
Bases: dragonn.synthetic.synthetic.BestHitPwm
Convenience wrapper class for instantiating parent by pulling the pwm given the name from an AbstractLoadedMotifs object (it basically extracts the pwm for you).
-
class dragonn.synthetic.synthetic.ChooseMutationAtRandom(setOfMutations, name=None)
Bases: dragonn.synthetic.synthetic.AbstractApplySingleMutationFromSet
Selects a mutation at random from self.setOfMutations to apply; see parent docs.
-
class dragonn.synthetic.synthetic.ChooseValueFromASet(setOfPossibleValues, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Randomly samples a particular value from a set of values.
-
class dragonn.synthetic.synthetic.EmbedInABackground(backgroundGenerator, embedders, namePrefix=None)
Bases: dragonn.synthetic.synthetic.AbstractSingleSequenceGenerator
Takes a backgroundGenerator and a series of embedders. Will generate the background and then call each of the embedders in succession. Then returns the result.
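The generate-then-embed flow described here can be sketched with plain functions. Everything in the block (embed_in_background, make_fixed_embedder, the callable-based embedder protocol) is a hypothetical stand-in chosen to keep the example self-contained; the real classes use the embed method and AbstractPriorEmbeddedThings documented on this page.

```python
def embed_in_background(background_generator, embedders):
    """Generate a background, then let each embedder modify it in order."""
    background = list(background_generator())  # array of characters
    embedded = []  # (start, string) records, standing in for prior-embedded-things
    for embedder in embedders:
        embedder(background, embedded)
    return "".join(background), embedded


def make_fixed_embedder(motif, start):
    """Build an embedder that writes motif at a fixed start (for determinism)."""
    def embedder(background, embedded):
        if start + len(motif) > len(background):
            raise ValueError("motif does not fit in the background")
        background[start:start + len(motif)] = motif
        embedded.append((start, motif))
    return embedder


sequence, records = embed_in_background(
    lambda: "A" * 12,
    [make_fixed_embedder("GAT", 2), make_fixed_embedder("CC", 8)],
)
```

The embedders mutate the shared character array in place, mirroring the "modifies: backgroundStringArr" contract of AbstractEmbedder.embed.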
-
class dragonn.synthetic.synthetic.EmbeddableEmbedder(embeddableGenerator, positionGenerator=<dragonn.synthetic.synthetic.UniformPositionGenerator object>, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbedder
Embeds instances of AbstractEmbeddable within the background sequence, at a position sampled from a distribution. Only embeds at unoccupied positions.
-
class dragonn.synthetic.synthetic.Embedding(what, startPos)
Bases: object
Represents something that has been embedded in a sequence.
-
class dragonn.synthetic.synthetic.FixedQuantityGenerator(quantity, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Returns a fixed number every time generateQuantity is called.
-
class dragonn.synthetic.synthetic.FixedSubstringGenerator(fixedSubstring, name=None)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
When generateSubstring() is called, always returns the same string. The string also serves as its own description.
-
class dragonn.synthetic.synthetic.GenerateSequenceNTimes(singleSetGenerator, N)
Bases: dragonn.synthetic.synthetic.AbstractSequenceSetGenerator
If you just want to use a generator of a single sequence and call it N times, use this class.
-
class dragonn.synthetic.synthetic.GeneratedSequence(seqName, seq, embeddings, additionalInfo)
Bases: object
An object representing a sequence that has been generated.
-
class dragonn.synthetic.synthetic.InsideCentralBp(centralBp, name=None)
Bases: dragonn.synthetic.synthetic.AbstractPositionGenerator
Returns a position within the central region of a background sequence, sampled uniformly at random.
-
class dragonn.synthetic.synthetic.LabelGenerator(labelNames, labelsFromGeneratedSequenceFunction)
Bases: object
-
class dragonn.synthetic.synthetic.LoadedEncodeMotifs(fileName, pseudocountProb=0.0, background=OrderedDict([('A', 0.27), ('C', 0.23), ('G', 0.23), ('T', 0.27)]))
Bases: dragonn.synthetic.synthetic.AbstractLoadedMotifs
This class is specifically for reading files in the encode motif format - specifically the motifs.txt file that contains Pouya’s motifs.
-
class dragonn.synthetic.synthetic.MinMaxWrapper(quantityGenerator, theMin=None, theMax=None, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Wrapper that restricts a distribution to only return values between the min and the max. If a value outside the range is returned, resamples until it obtains a value within the range. Warns if it resamples too many times.
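The resampling behaviour described above amounts to rejection sampling. The following is a minimal sketch under stated assumptions: sample_within_range is a hypothetical helper, the bounds are treated as inclusive, and the warn_after threshold of 100 is invented for the example (the docstring does not say when the warning fires).

```python
import random
import warnings


def sample_within_range(sampler, the_min=None, the_max=None, warn_after=100):
    """Call sampler() until the value lies in [the_min, the_max].

    A bound of None means "unbounded" on that side. warn_after is an
    assumed threshold, not a documented dragonn parameter.
    """
    tries = 0
    while True:
        value = sampler()
        tries += 1
        above_min = the_min is None or value >= the_min
        below_max = the_max is None or value <= the_max
        if above_min and below_max:
            return value
        if tries % warn_after == 0:
            warnings.warn("resampled %d times; the range may be too narrow" % tries)


rng = random.Random(0)
draws = [
    sample_within_range(lambda: rng.randint(0, 9), the_min=3, the_max=6)
    for _ in range(50)
]
```

Note that rejection sampling renormalizes the wrapped distribution over the allowed range rather than clipping values to the bounds.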
-
class dragonn.synthetic.synthetic.OutsideCentralBp(centralBp, name=None)
Bases: dragonn.synthetic.synthetic.AbstractPositionGenerator
Returns a position OUTSIDE the central region of a background sequence, sampled uniformly at random. Complement of InsideCentralBp.
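One plausible reading of InsideCentralBp and OutsideCentralBp is sketched below. The details are assumptions, not taken from the source: the central region is taken to be a window of centralBp bases centered on the background, "inside" means the whole substring fits in that window, and "outside" means the whole substring lies in one of the flanks. All function names are hypothetical.

```python
import random


def central_window(background_len, central_bp):
    """Return [lo, hi) bounds of an assumed centered window of central_bp bases."""
    if central_bp > background_len:
        raise ValueError("central region is longer than the background")
    lo = background_len // 2 - central_bp // 2
    return lo, lo + central_bp


def inside_central(background_len, substring_len, central_bp, rng):
    """Uniform start such that the substring lies entirely in the window."""
    lo, hi = central_window(background_len, central_bp)
    if substring_len > central_bp:
        raise ValueError("substring does not fit in the central region")
    return rng.randint(lo, hi - substring_len)


def outside_central(background_len, substring_len, central_bp, rng):
    """Uniform start such that the substring lies entirely in a flank."""
    lo, hi = central_window(background_len, central_bp)
    left = list(range(0, lo - substring_len + 1))
    right = list(range(hi, background_len - substring_len + 1))
    candidates = left + right
    if not candidates:
        raise ValueError("substring does not fit in either flank")
    return rng.choice(candidates)


rng = random.Random(0)
inside = [inside_central(100, 6, 20, rng) for _ in range(200)]
outside = [outside_central(100, 6, 20, rng) for _ in range(200)]
```

With a 100-base background and a 20-base central region, the assumed window spans positions 40 through 59.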
-
class dragonn.synthetic.synthetic.PairEmbeddable(string1, string2, separation, embeddableDescription, nothingInBetween=True)
Bases: dragonn.synthetic.synthetic.AbstractEmbeddable
Represents a pair of strings that are embedded with some separation. Used for motif grammars. See superclass docs.
-
class dragonn.synthetic.synthetic.PairEmbeddableGenerator(substringGenerator1, substringGenerator2, separationGenerator, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class dragonn.synthetic.synthetic.PairEmbeddableGenerator_General(embeddableGenerator1, embeddableGenerator2, separationGenerator, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class dragonn.synthetic.synthetic.PairEmbeddable_General(embeddable1, embeddable2, separation, embeddableDescription, nothingInBetween=True)
Bases: dragonn.synthetic.synthetic.AbstractEmbeddable
Embeds two Embeddable objects with some separation.
-
class dragonn.synthetic.synthetic.PoissonQuantityGenerator(mean, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Generates values according to a Poisson distribution.
-
class dragonn.synthetic.synthetic.PriorEmbeddedThings_numpyArrayBacked(seqLen)
Bases: dragonn.synthetic.synthetic.AbstractPriorEmbeddedThings
Uses a numpy array where positions are set to 1 if they are occupied, to determine which positions are occupied and which are not. See parent for more documentation.
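The occupancy-array idea can be sketched as follows. OccupancyTracker is a hypothetical stand-in, not the library class: a plain Python list replaces the numpy array so the example has no dependencies, and the region passed to can_embed is assumed to be half-open, [start_pos, end_pos).

```python
class OccupancyTracker:
    """Hypothetical occupancy tracker; illustrates the idea only."""

    def __init__(self, seq_len):
        # 0 = free, 1 = occupied (a list stands in for the numpy array).
        self.occupied = [0] * seq_len
        self.embeddings = []  # (start_pos, what) pairs in insertion order

    def can_embed(self, start_pos, end_pos):
        # Assumes the half-open interval [start_pos, end_pos).
        if start_pos < 0 or end_pos > len(self.occupied):
            return False
        return not any(self.occupied[start_pos:end_pos])

    def add_embedding(self, start_pos, what):
        # Mark positions start_pos .. start_pos + len(what) - 1 as occupied.
        for i in range(start_pos, start_pos + len(what)):
            self.occupied[i] = 1
        self.embeddings.append((start_pos, what))


tracker = OccupancyTracker(seq_len=20)
tracker.add_embedding(5, "GATTACA")      # occupies positions 5 through 11
free_before = tracker.can_embed(0, 5)    # region left of the motif
overlapping = tracker.can_embed(10, 14)  # touches positions 10 and 11
```

An embedder would typically call can_embed before add_embedding and sample a new position when the region is taken.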
-
class dragonn.synthetic.synthetic.PwmSampler(pwm, name=None)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
Samples from the pwm by calling self.pwm.sampleFromPwm.
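Sampling from a PWM usually means drawing one base per position according to that position's probabilities. The sketch below shows that idea only; sample_from_pwm and the list-of-dicts PWM layout are assumptions for illustration, and the behaviour of pwm.PWM.sampleFromPwm itself is not documented on this page.

```python
import random


def sample_from_pwm(pwm_rows, rng):
    """Draw one base per position, weighted by that position's probabilities."""
    sampled = []
    for row in pwm_rows:
        bases = list(row)
        weights = [row[base] for base in bases]
        sampled.append(rng.choices(bases, weights=weights, k=1)[0])
    return "".join(sampled)


# Toy PWM: positions 0 and 2 are fixed, position 1 is uniform.
toy_pwm = [
    {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.0, "C": 0.0, "G": 1.0, "T": 0.0},
]
rng = random.Random(0)
samples = [sample_from_pwm(toy_pwm, rng) for _ in range(20)]
```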
-
class dragonn.synthetic.synthetic.PwmSamplerFromLoadedMotifs(loadedMotifs, motifName, name=None)
Bases: dragonn.synthetic.synthetic.PwmSampler
Convenience wrapper class for instantiating parent by pulling the pwm given the name from an AbstractLoadedMotifs object (it basically extracts the pwm for you).
-
class dragonn.synthetic.synthetic.RandomSubsetOfEmbedders(quantityGenerator, embedders, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbedder
Takes a quantity generator that generates a quantity of embedders, and executes that many embedders from a supplied set, in sequence.
-
class dragonn.synthetic.synthetic.RepeatedEmbedder(embedder, quantityGenerator, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbedder
Wrapper around an embedder to call it multiple times according to sampling from a distribution.
-
class dragonn.synthetic.synthetic.RepeatedSubstringBackgroundGenerator(substringGenerator, repetitions)
Bases: dragonn.synthetic.synthetic.AbstractBackgroundGenerator
-
class dragonn.synthetic.synthetic.ReverseComplementWrapper(substringGenerator, reverseComplementProb=0.5, name=None)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
Wrapper around an AbstractSubstringGenerator that reverse complements its output with the specified probability.
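For readers unfamiliar with the operation, a reverse complement swaps each base for its partner (A with T, C with G) and reverses the string. The sketch below shows the probabilistic wrapping under that definition; reverse_complement and maybe_reverse_complement are hypothetical helper names, not dragonn functions.

```python
import random

# Translation table for DNA base pairing (both cases handled).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def maybe_reverse_complement(seq, prob, rng):
    """Return the reverse complement with probability prob, else seq unchanged."""
    if rng.random() < prob:
        return reverse_complement(seq)
    return seq


rng = random.Random(0)
flipped = reverse_complement("GATTACA")
unchanged = maybe_reverse_complement("GATTACA", 0.0, rng)
always = maybe_reverse_complement("GATTACA", 1.0, rng)
```

Applying the reverse complement twice returns the original string, which makes a handy sanity check.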
-
class dragonn.synthetic.synthetic.RevertToReference(setOfMutations, name=None)
Bases: dragonn.synthetic.synthetic.AbstractTransformation
For a series of mutations, reverts the supplied string to the reference (“unmutated”) string.
-
class dragonn.synthetic.synthetic.SampleFromDiscreteDistributionSubstringGenerator(discreteDistribution)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
-
class dragonn.synthetic.synthetic.StringEmbeddable(string, stringDescription='')
Bases: dragonn.synthetic.synthetic.AbstractEmbeddable
Represents a string (such as a sampling from a pwm) that is to be embedded in a background. See docs for superclass.
-
class dragonn.synthetic.synthetic.SubstringEmbeddableGenerator(substringGenerator, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbeddableGenerator
-
class dragonn.synthetic.synthetic.SubstringEmbedder(substringGenerator, positionGenerator=<dragonn.synthetic.synthetic.UniformPositionGenerator object>, name=None)
Bases: dragonn.synthetic.synthetic.EmbeddableEmbedder
Embeds a single generated substring within the background sequence, at a position sampled from a distribution. Only embeds at unoccupied positions.
-
class dragonn.synthetic.synthetic.TopNMutationsFromPwmRelativeToBestHit(pwm, N, bestHitMode)
Bases: dragonn.synthetic.synthetic.AbstractSetOfMutations
See docs for parent; here, the collection of mutations is the top N strongest mutations for a PWM as compared to the best match for that pwm.
-
class dragonn.synthetic.synthetic.TopNMutationsFromPwmRelativeToBestHit_FromLoadedMotifs(loadedMotifs, pwmName, N, bestHitMode)
Bases: dragonn.synthetic.synthetic.TopNMutationsFromPwmRelativeToBestHit
Like parent, except extracts the pwm.PWM object from an AbstractLoadedMotifs object, saving you a few lines of code.
-
class dragonn.synthetic.synthetic.TransformedSubstringGenerator(substringGenerator, transformations, transformationsDescription='transformations', name=None)
Bases: dragonn.synthetic.synthetic.AbstractSubstringGenerator
Takes a substringGenerator and a set of AbstractTransformation objects, and applies the transformations to the generated substring.
-
class dragonn.synthetic.synthetic.UniformIntegerGenerator(minVal, maxVal, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Randomly samples an integer from minVal to maxVal, inclusive.
-
class dragonn.synthetic.synthetic.UniformPositionGenerator(name=None)
Bases: dragonn.synthetic.synthetic.AbstractPositionGenerator
Samples a start position to embed the substring in uniformly at random; does not return positions that are too close to the end of the background sequence to embed the full substring.
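The constraint described above means the last valid start is the background length minus the substring length. A minimal sketch, with uniform_start as a hypothetical helper rather than the class's actual method:

```python
import random


def uniform_start(background_len, substring_len, rng):
    """Pick a start uniformly from the positions where the substring fits."""
    if substring_len > background_len:
        raise ValueError("substring is longer than the background")
    # randint is inclusive on both ends, so the last valid start is included.
    return rng.randint(0, background_len - substring_len)


rng = random.Random(0)
starts = [uniform_start(50, 8, rng) for _ in range(200)]
exact_fit = uniform_start(8, 8, rng)  # only one valid start exists
```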
-
class dragonn.synthetic.synthetic.XOREmbedder(embedder1, embedder2, probOfFirst, name=None)
Bases: dragonn.synthetic.synthetic.AbstractEmbedder
Calls exactly one of the supplied embedders.
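The exclusive choice can be sketched as a single random draw, assuming probOfFirst is the probability of picking embedder1 as the parameter name suggests. The callables and the xor_embed helper below are hypothetical and only illustrate that each invocation triggers exactly one of the two.

```python
import random


def xor_embed(embedder1, embedder2, prob_of_first, rng):
    """Call embedder1 with probability prob_of_first, otherwise embedder2."""
    chosen = embedder1 if rng.random() < prob_of_first else embedder2
    chosen()


calls = {"first": 0, "second": 0}


def embed_first():
    calls["first"] += 1


def embed_second():
    calls["second"] += 1


rng = random.Random(0)
for _ in range(100):
    xor_embed(embed_first, embed_second, 0.3, rng)
```

After 100 invocations the two counters always sum to 100, since neither "both" nor "neither" can occur.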
-
class dragonn.synthetic.synthetic.ZeroInflater(quantityGenerator, zeroProb, name=None)
Bases: dragonn.synthetic.synthetic.AbstractQuantityGenerator
Wrapper that inflates the number of zeros returned. Flips a coin; if positive, will return zero - otherwise will sample from the wrapped distribution (which may still return 0).
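The coin flip described above can be written in a few lines. This is a sketch, assuming zeroProb is the probability that the coin forces a zero; zero_inflated is a hypothetical helper name.

```python
import random


def zero_inflated(sampler, zero_prob, rng):
    """Return 0 with probability zero_prob, otherwise defer to sampler()."""
    if rng.random() < zero_prob:
        return 0
    return sampler()  # may itself return 0


rng = random.Random(1)
always_zero = [zero_inflated(lambda: 5, 1.0, rng) for _ in range(10)]
never_forced = [zero_inflated(lambda: 5, 0.0, rng) for _ in range(10)]
```

Under this reading, the overall chance of a zero is zeroProb plus (1 - zeroProb) times the wrapped distribution's own probability of zero.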
-
class dragonn.synthetic.synthetic.ZeroOrderBackgroundGenerator(seqLength, discreteDistribution=<dragonn.synthetic.util.DiscreteDistribution object>)
Bases: dragonn.synthetic.synthetic.RepeatedSubstringBackgroundGenerator
Returns a sequence with 40% GC content. Each base is sampled independently from discreteDistribution.
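"Zero-order" here means each base is drawn without reference to its neighbours. The sketch below illustrates that with the standard library; zero_order_background is a hypothetical helper, and the base probabilities are an assumption chosen to give 40% GC content, not values read from util.DiscreteDistribution.

```python
import random


def zero_order_background(length, base_probs, rng):
    """Draw each base independently according to base_probs."""
    bases = list(base_probs)
    weights = [base_probs[base] for base in bases]
    return "".join(rng.choices(bases, weights=weights, k=length))


# Assumed distribution: C + G = 0.4, i.e. 40% expected GC content.
probs = {"A": 0.3, "C": 0.2, "G": 0.2, "T": 0.3}
rng = random.Random(0)
seq = zero_order_background(10000, probs, rng)
gc_fraction = (seq.count("G") + seq.count("C")) / len(seq)
```

For a long sequence the observed GC fraction should sit close to 0.4, though any single draw will deviate slightly.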
-
dragonn.synthetic.synthetic.generateString_zeroOrderMarkov(length, discreteDistribution=<dragonn.synthetic.util.DiscreteDistribution object>)
discreteDistribution: instance of util.DiscreteDistribution
-
dragonn.synthetic.synthetic.printSequences(outputFileName, sequenceSetGenerator, includeEmbeddings=False, labelGenerator=None, includeFasta=False)
Given an output filename and an instance of AbstractSequenceSetGenerator, will call the sequence set generator and print the generated sequences to the output file. Will also create a file “info_outputFileName.txt” in the same directory as outputFileName that contains all the information about sequenceSetGenerator.
outputFileName: string
sequenceSetGenerator: instance of AbstractSequenceSetGenerator
includeEmbeddings: a boolean indicating whether to print a column that lists the embeddings
labelGenerator: instance of LabelGenerator
-
dragonn.synthetic.synthetic.printSequencesTransformationPosNeg(outputFileNamePos, outputFileNameNeg, sequenceSetGenerator, transformation)
outputFileName: string
sequenceSetGenerator: instance of AbstractSequenceSetGenerator
generatedSequences: the sequences that have been generated by sequenceSetGenerator
Given an output filename and an instance of AbstractSequenceSetGenerator, will print the generated sequences to the output file. Will also create a file “info_outputFileName.txt” in the same directory as outputFileName that contains all the information about sequenceSetGenerator.