Chemical Identification


This is a decent on-line periodic table.

Periodic Table

This map is exceptionally clever and useful. Elements


One of the best pieces of evidence that chemistry is a sloppy field is the existence of CAS numbers. CAS stands for "Chemical Abstracts Service". The problem with CAS numbers is that the mapping between chemical and number is established and controlled by a proprietary, non-public registry run by a division of the American Chemical Society. This makes it impossible for poorly funded groups and individuals to work with large quantities of these numbers.

Even though most chemists use these numbers most of the time, I believe it is wrong to do so and they should be avoided categorically. It is especially insidious when important government safety regulations refer to chemicals exclusively by this closed system designed for insiders.


The correct way to refer to chemicals is by their SMILES string.

Here is a thorough reference from Molsoft. ICM can generate models based on SMILES strings.

build smiles 'O=C(OCCC)c1cc(O)c(O)c(O)c1'

SMILES (Simplified Molecular Input Line Entry System) is a way to represent the graph of a chemical structure, i.e. which atoms are present and how they connect to each other. The SMILES string is obtained by listing the nodes encountered in a depth-first tree traversal after the graph is turned into a spanning tree by breaking any cycles. Where cycles have been broken, numbers are used to indicate the breaks. Parentheses are used to indicate points of branching on the tree. Hydrogens are usually implicit: they are left out of the string and inferred from normal valence.

  • [] - Elements with no implicit hydrogens, so [O] is an oxygen atom alone while water is simply O. This seems to work for groups too. Only B, C, N, O, P, S, F, Cl, Br, and I can be used without brackets.

  • () - Branches.

  • = - Double bond.

  • # - Triple bond.

  • $ - Quadruple bond.

  • % - Prefix for a two-digit ring-closure label, needed when more than 9 ring bonds are open at once, e.g. %10.

  • / - Stereochemistry. cis vs. trans (across)

  • \ - Stereochemistry.

  • @ - Stereochemistry, tetrahedral.

Isotopes are written [14C] for Carbon-14. A lowercase symbol such as [14c] would mean an aromatic carbon-14.
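To make the symbol list above concrete, here is a minimal Python sketch of a SMILES tokenizer. It is an illustration only, not a full parser: the names `TOKEN_SPEC`, `tokenize`, and `check_structure` are my own, it does not look inside bracket atoms, and it does not check valence or aromaticity. It only splits a string into tokens and checks that branches and ring-closure labels pair up.

```python
import re

# One pattern per token kind, tried left to right.
# Two-letter symbols (Cl, Br) must come before the one-letter ones.
TOKEN_SPEC = [
    ("bracket", r"\[[^\]]+\]"),               # bracket atom, e.g. [O] or [14C]
    ("atom",    r"Cl|Br|[BCNOPSFI]|[bcnops]"),  # lowercase means aromatic
    ("bond",    r"[-=#$:/\\]"),               # explicit and stereo bonds
    ("ring",    r"%\d\d|\d"),                 # ring-closure label
    ("branch",  r"[()]"),                     # branch open or close
    ("dot",     r"\."),                       # disconnected parts
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(smiles):
    """Split a SMILES string into (kind, text) pairs."""
    tokens = []
    pos = 0
    while pos < len(smiles):
        match = TOKEN_RE.match(smiles, pos)
        if match is None:
            raise ValueError(f"unexpected character {smiles[pos]!r} at position {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def check_structure(tokens):
    """True if parentheses balance and every ring label appears in pairs."""
    depth = 0
    open_rings = set()
    for kind, text in tokens:
        if kind == "branch":
            depth += 1 if text == "(" else -1
            if depth < 0:
                return False
        elif kind == "ring":
            # The first use of a label opens a ring, the second closes it.
            open_rings ^= {text.lstrip("%")}
    return depth == 0 and not open_rings


tokens = tokenize("O=C(OCCC)c1cc(O)c(O)c(O)c1")
heavy_atoms = sum(1 for kind, _ in tokens if kind in ("atom", "bracket"))
print(heavy_atoms, check_structure(tokens))
```

The example string is the same one used in the ICM command above. Counting only atom tokens gives the heavy atoms, since the hydrogens are implicit.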


Although SMILES is rarely insufficient to convey what is needed about a molecule, there are more verbose formats which are extremely common. Although SDF is an embarrassingly terrible data format, it is nearly universal in biochemistry.

SDF stands for "Structure Data File" and is an extension of the "molfile" format. The relationship is described in this Wikipedia article. Here is a proper definition of the format.

Here are the main important points of an SDF file.

benzene                                 <= Molecule name.
ACD/Labs0812062058                      <= User/Program/Date/Who knows?
                                        <= Comment. Required. Often blank.
 6  6  0  0  0  0  0  0  0  0  1 V2000  <= 6 atoms, 6 bonds, Version 2k or 3k.
   1.9050   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   1.9050   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   0.7531   -0.1282    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
   0.7531   -2.7882    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  -0.3987   -0.7932    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  -0.3987   -2.1232    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
 2  1  1  0  0  0  0                    \= Atom block. Enumerates atoms.
 3  1  2  0  0  0  0                    <= The bond block.
 4  2  2  0  0  0  0                       Describes atom relationships.
 5  3  1  0  0  0  0
 6  4  1  0  0  0  0
 6  5  2  0  0  0  0
M  STY  1   1 SUP                       <= Properties block. (S group type)
M  SLB  1   1   1                          (S group label) No properties required.
M  SAL   1  4  26  27  28  29              (S group atom list) Examples only.
M  SBL   1  1   1                          (S group bond list)
M  SMT   1 CF3                             (S group subscript)
M  SBV   1   1    0.0000   -0.8250         (Super atom bond and vector)
M  END                                  <= End of the M section block
>  <Vendor>                             <= Data header.
The Big Energy Corp.                    <= Data for that property.
                                        <= Separate with blank line.
>  <Molecular Weight>                   <= As many properties as you need.
$$$$                                    <= Record separator.
The fields of the atom block.
  • X, Y, Z coordinates

  • Periodic table chemical symbol

  • 1- Mass difference (maybe unnecessary)

  • 2- Charge (maybe superfluous and optional)

  • 3- Stereo parity (often ignored)

  • 4- Hydrogen count

  • 5- "Stereo care" flag (wastes 3 bytes for one mostly useless bit)

  • 6- Valence or number of bonds to this atom (redundant, optional)

  • 7- H0 designator (redundant, ignored)

  • 8- Not used!

  • 9- Not used!

  • 10- "Atom mapping number" (no more information)

  • 11- "Inversion retention flag" (0,1, or 2)

  • 12- "Exact change flag" (1 bit 0 or 1)

The fields of the bond block.
  • First atom’s position in atom block.

  • Second atom’s position in atom block.

  • Type (1=single,2=double,3=triple,4=aromatic,5=1or2,6=1or4,7=2or4,8=any)

  • Stereo (0=not,1=up,4=either,3=cis/trans,6=down; pointy end is first atom).

  • Not used!

  • Bond topology (0=either,1=ring,2=chain)

  • Reacting center status (0=unmarked,1=center,-1=not,2=no change, 4=bond made/broken,8=bond order changes, 12=4&8,5=4&1,9=8&1,13=12&1)
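The layout described above can be read with very little code. Here is a minimal Python sketch that pulls the name, atoms, and bonds out of one V2000 record. The function name `parse_molfile` and the `SAMPLE` record are hypothetical, made up for illustration. It also assumes that the fields are separated by whitespace; the real format is fixed-width, so this shortcut breaks when large numbers make adjacent columns touch.

```python
# A hypothetical two-atom record (heavy atoms only), written for this example.
SAMPLE = """\
ethane
  hypothetical-program

  2  1  0  0  0  0  0  0  0  0  1 V2000
    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
    1.5400    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0
  1  2  1  0  0  0  0
M  END
$$$$
"""


def parse_molfile(text):
    """Return (name, atoms, bonds) from a single V2000 record.

    Assumes whitespace-separated fields, which is a simplification
    of the fixed-width layout.
    """
    lines = text.splitlines()
    name = lines[0].strip()          # line 1: molecule name
    counts = lines[3].split()        # line 4: counts line
    n_atoms, n_bonds = int(counts[0]), int(counts[1])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z, symbol = line.split()[:4]
        atoms.append((symbol, float(x), float(y), float(z)))

    bonds = []
    start = 4 + n_atoms
    for line in lines[start:start + n_bonds]:
        first, second, bond_type = (int(v) for v in line.split()[:3])
        bonds.append((first, second, bond_type))

    return name, atoms, bonds


name, atoms, bonds = parse_molfile(SAMPLE)
print(name, len(atoms), bonds)
```

Atom numbers in the bond tuples are 1-based positions in the atom block, as in the file itself. The properties block and the data items after "M  END" are ignored here.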


An acid is a molecule or ion capable of donating a proton. A proton can be thought of as a hydrogen ion H+.

pH is approximately the negative of the logarithm to base 10 of the molar concentration, measured in units of moles per liter, of hydrogen ions. A pH of less than 7 is acidic, greater is basic.

The name pH is usually read as "power of hydrogen".

Proper pH for a swimming pool is 7.35, the pH of the human eye. This is slightly basic.
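The definition of pH above is easy to check numerically. This is a small Python sketch with function names of my own choosing. It uses the simple concentration form of the definition and ignores activity corrections, which is why the text says "approximately".

```python
import math


def ph_from_concentration(h_molar):
    """pH from hydrogen ion concentration in moles per liter."""
    if h_molar <= 0:
        raise ValueError("concentration must be positive")
    return -math.log10(h_molar)


def concentration_from_ph(ph):
    """Hydrogen ion concentration in moles per liter for a given pH."""
    return 10.0 ** (-ph)


def classify(ph):
    """Label a pH value relative to neutral (7)."""
    if ph < 7:
        return "acidic"
    if ph > 7:
        return "basic"
    return "neutral"


pool_ph = 7.35
print(classify(pool_ph), concentration_from_ph(pool_ph))
```

Because the scale is logarithmic, each step of one pH unit is a factor of ten in hydrogen ion concentration.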

Properties of acids.
  • Release H+ ions in aqueous solution

  • Lower pH than 7

  • Blue paper turns red

Properties of bases.
  • Release hydroxide ions in aqueous solution

  • Strong bases often end in "hydroxide", e.g. lithium hydroxide, sodium hydroxide, potassium hydroxide, and calcium hydroxide

  • Red paper turns blue

  • Includes soaps

  • Higher pH than 7

Table 1. Typical pH (listed from very basic to very acidic)

  • Milk of magnesia

  • Baking soda

  • Human blood

  • Milk (cow's?)

  • Human skin

  • Sour milk

  • Battery acid

  • Hydrochloric acid


When acids and bases react, they form water and a salt.



A phosphate is a salt of phosphoric acid. It is characterized by a group based on a phosphorus atom surrounded by 4 oxygen atoms in a tetrahedral arrangement. 3 of the oxygens are negatively charged and the other has a double bond to the phosphorus. Adding phosphates (phosphorylation) and removing them (dephosphorylation) from proteins are key energy managing strategies for living things.

Phosphates usually either pick up hydrogens or stay negatively charged. They can also pick up other ions such as sodium, producing mono-, di-, and trisodium phosphate.


Here is a superb chart of organic compounds organized by smell.


Hydroxyl = a chemical unit consisting of a hydrogen attached to an oxygen which is attached to something else. I believe these can be found on free floating amino acids at the point where they would otherwise be attached to a protein. Serine, threonine, and tyrosine also have hydroxyl groups in their side chains.

Free radicals seem to be a related topic.


Alcohol = If an -OH hydroxyl is bonded to a carbon in a molecule that contains only carbon and hydrogen, then the molecule is, by virtue of the -OH, an alcohol. Alcohols tend to end in ol like ethanol (human intoxicating), ethylene glycol (antifreeze), diethylene glycol (infamous contaminant), methanol (fuel), menthol (analgesic).

Drinking alcohol is ethanol and is an acyclic chain of two single bond carbons and a hydroxyl with all the proper hydrogens.


Esters are derived from an acid (organic or inorganic) in which at least one -OH (hydroxyl) group is replaced by an -O-alkyl (alkoxy) group. Usually, esters are derived from a carboxylic acid and an alcohol.

The key feature of an ester is a carbon with the following connections.
  • Double bond to oxygen.

  • Single bond to some other moiety (often labeled R). Optionally just hydrogen.

  • Single bond to an oxygen which itself is bound to some other moiety (R'). Or just a -1 charged oxygen.

Some esters.

  • Glycerides - fatty acid esters of glycerol, main biological lipid class (animal fat and vegetable oil).

  • Fragrances and pheromones - often low molecular weight esters.

  • Phosphoesters - backbone of DNA.

  • Nitrate esters - explosive (e.g. nitroglycerin).

  • Polyesters - plastics made of monomers linked with esters.

An esterase is a hydrolase enzyme that splits esters into an acid and an alcohol in a chemical reaction with water called hydrolysis. The one I’ve heard most about is acetylcholinesterase which catalyzes the breakdown of acetylcholine and of some other choline esters that function as neurotransmitters. Neurotransmitters are of interest to migraine pathology.

Polycyclic Aromatic Hydrocarbon

Lattices of rings of carbon and hydrogen (only). Like chicken wire. Neutral, nonpolar. Can be produced by incomplete combustion of organic matter. Can be found in tar and asphalt runoff. Also found in smog and marine oil spills.

Simplest is naphthalene which just has two rings.

Believed to be carcinogenic.


A Methyl group involves one carbon atom bonded to three hydrogen atoms. The fourth carbon bond can attach to some other part of the molecule. If that is just a fourth hydrogen then the molecule is methane.

Methylation and demethylation are common and involve moving the methyl group to or from another compound.

If the methyl group is attached to an -OH it becomes methanol.


A butyl group is a four-carbon group that comes in several configurations. A normal butyl (n-butyl) group is something like: R/\/\ where R is the rest of the molecule and the four carbons trail off it in an unbranched chain. A tert-butyl group is branched instead: R attaches to the central carbon, which is also bonded to the other three carbons (three methyl groups) in a tetrahedral arrangement.


Saturated compounds are composed of carbon chains that contain only single bonds. Unsaturated molecules can contain double and triple carbon bonds. Alkenes and alkynes are unsaturated.


An alkane consists of hydrogen and carbon atoms arranged in an acyclic tree structure in which all the carbon-carbon bonds are single. When all carbon bonds are single, the molecule is called a saturated hydrocarbon. The simplest alkane is methane which is just a carbon atom with 4 hydrogens (CH4).

Alkanes are sometimes called paraffins, although "paraffin" does not always mean alkane now. If the backbone has more than about 17 carbons, it is usually called a wax. Alkanes are not very reactive and have relatively little biological activity.

Oil and natural gas contain alkanes (and a bunch of other sludge).

Table 2. Alkanes

  • 1 carbon - methane

  • 2 carbons - ethane

  • 3 carbons - propane

  • 4 carbons - butane

  • 5 carbons - pentane

  • 6 carbons - hexane

  • 7 carbons - heptane

  • 8 carbons - octane
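Since an alkane is an acyclic tree with only single bonds, its formula follows from counting bonds. Each carbon makes 4 bonds, and a tree of n carbons has n-1 carbon-carbon bonds, which use up 2(n-1) of them. That leaves 4n - 2(n-1) = 2n+2 bonds for hydrogens. A minimal Python sketch of that arithmetic, with function names that are my own:

```python
def alkane_formula(n):
    """Molecular formula of an acyclic alkane with n carbons."""
    if n < 1:
        raise ValueError("need at least one carbon")
    hydrogens = 2 * n + 2            # 4n bonds minus 2(n-1) used by C-C bonds
    carbon = "C" if n == 1 else f"C{n}"
    return f"{carbon}H{hydrogens}"


def linear_alkane_smiles(n):
    """SMILES of the unbranched alkane; hydrogens are implicit."""
    if n < 1:
        raise ValueError("need at least one carbon")
    return "C" * n


for n in range(1, 9):
    print(n, alkane_formula(n), linear_alkane_smiles(n))
```

The formula is the same for branched alkanes with the same carbon count, since branching does not change the number of carbon-carbon bonds in a tree.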


There are linear and branched alkanes. There are also cyclic alkanes, called cycloalkanes which are like regular alkanes but they form loops or cycles.

Table 3. Cycloalkane bond angles

  • 3 carbons - cyclopropane

  • 4 carbons - cyclobutane

  • 5 carbons - cyclopentane

  • 6 carbons - cyclohexane
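One way to get a feel for these angles is to pretend each ring is a flat regular polygon and compare its interior angle with the ideal tetrahedral angle of about 109.5 degrees. That flatness is an assumption of this sketch, not a fact about the molecules: rings larger than cyclopropane pucker out of the plane, so their real angles differ from these numbers. The names below are my own.

```python
import math

# Ideal angle between bonds on a tetrahedral carbon, about 109.47 degrees.
TETRAHEDRAL_ANGLE = math.degrees(math.acos(-1.0 / 3.0))


def planar_ring_angle(n):
    """Interior angle in degrees of a flat regular ring of n atoms."""
    if n < 3:
        raise ValueError("a ring needs at least 3 atoms")
    return 180.0 * (n - 2) / n


for n in range(3, 7):
    angle = planar_ring_angle(n)
    deviation = angle - TETRAHEDRAL_ANGLE
    print(f"{n} carbons: {angle:.1f} degrees, {deviation:+.1f} from tetrahedral")
```

The large negative deviation for 3 and 4 carbons is the usual explanation for why the small rings are strained.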




An alkene is an unsaturated hydrocarbon that contains at least one carbon-carbon double bond. The simplest is a pair of double-bonded carbons, each with 2 hydrogens: C2H4, called ethylene or ethene. Alkenes are sometimes called olefins (also spelled olefines).

Propylene and isobutylene are alkenes. Vinyl (a.k.a. ethenyl) is the group left when ethylene loses one hydrogen.


An ethyl group is a pair of single-bonded carbon atoms attached to the rest of the molecule. Something like: R-CH2CH3


An alkyne is an unsaturated hydrocarbon containing at least one carbon-carbon triple bond. The simplest example is ethyne, a.k.a. acetylene, which is just two carbons bound together with a triple bond and a hydrogen on each, C2H2.

Biological Chemistry


Teixobactin is an interesting oligopeptide.


Cellular fluid is called cytosol. Rough concentrations also here. Cytoplasm is the intracellular fluid containing the organelles; cytosol is the cytoplasm minus the organelles.

Composition of cytosol
Knowable things
  • water concentration

  • pH

  • Atomic proportions?

  • ?

Things I can think of that may be floating around

Cytosol References

  • Macromolecular crowding: obvious but underappreciated 2001 source similar


UV Filters

  • Benzophenone

  • Para-amino Benzoic Acid (PABA)

  • Zinc Oxide

  • Titanium Dioxide


Note the cautionary story of Linus Pauling who wrongly popularized the anti-oxidant mania, especially vitamin C and vitamin E.

  • Tocopherols (Vitamin E) - "A potential confounding factor is the form of Vitamin E used in these studies. As explained earlier, synthetic, racemic mixtures of Vitamin E isomers are not bioequivalent to natural, non-racemic mixtures, yet are widely used academically and commercially."

  • Ascorbic Acid (Vitamin C)

  • Erythorbic Acid

  • Butylated Hydroxyanisole

  • Butylated Hydroxytoluene

  • Sodium Citrate

  • Lecithin

  • Propyl Gallate - Estrogen antagonist? Carcinogenesis?


  • Sodium Benzoate

  • Benzoic Acid

  • Potassium Sorbate

  • Sorbic Acid

  • Natamycin

  • Triclosan

  • Triclocarban

  • Hexachlorophene

  • Acetic Acid

  • Sodium Chloride

  • Calcium Propionate

  • Imidazolidinyl Urea

  • Methylchloroisothiazolinone

  • Lactic Acid

  • Sodium Nitrate

  • Sodium Nitrite

  • DMDM Hydantoin

  • Glycols

  • 2-bromo-2-nitropropane-1,3-diol


  • Glucose

  • Fructose

  • Sucrose

  • Galactose

  • Mannose

  • Dextrose

  • HFCS

  • Corn Syrup

  • Honey

  • Aspartame

  • Saccharin

  • Neotame

  • Sucralose

  • Acesulfame Potassium

  • Invert sugar

  • Xylitol

  • Tagatose

  • Maltitol

  • Maltose

  • Trehalose

  • Lactose

  • Hydrogenated Starch Hydrolysate


  • Sodium Citrate

  • Aminomethyl Propanol (AMP)

  • Tetrasodium Pyrophosphate

  • Phosphoric Acid

Chelating/Sequestering Agents

  • Ethylene Diamine Tetra Acetic Acid (EDTA)

  • Diethylene Triamine Pentaacetic Acid

  • Phosphoric Acid (again?)

  • Tetrasodium Etidronate

  • Sodium Carbonate


  • Ethanol

  • Stearyl Alcohol

  • Cetyl Alcohol

  • Glycerin

  • Menthol


  • Esters


  • Acetic Acid

  • Citric Acid

  • Lactic Acid

  • Stearic Acid

  • Phosphoric Acid

  • Fumaric Acid

  • Tartaric Acid

  • Methyl Vanillin

  • Ethyl Vanillin

  • Denatonium Benzoate

  • Vanilla

  • Monosodium Glutamate


  • Potassium Chloride


  • Tristearin

  • Trilinolein

  • Olestra

  • Salatrim

  • Guar Gum

  • Locust Bean Gum

  • Xanthan Gum


  • Annatto

  • Beta-carotene

  • Carmine

  • Saffron

  • Turmeric

  • Titanium Dioxide

  • Allura Red

  • Indigo

  • Caseinate

  • Ferrous Gluconate

  • D&C Green No. 5

  • D&C Red No. 33

  • D&C Violet No. 2

  • FD&C Yellow No. 5

  • Ext. D&C Violet No. 2

Moisture Control

  • Glycerin

  • Sorbitol

  • Sodium PCA

  • Mannitol

  • Propylene Glycol

  • Butylene Glycol


  • Lecithin

  • Phosphoric Acid

  • Sorbitan Monostearate

  • Polysorbate 80

  • Glycerol Monostearate (Mono- and Diglycerides)

Stabilizers and Thickeners

  • Sodium Caseinate

  • Calcium Caseinate

  • Polyethylene Glycol (PEG)

  • Polypropylene Glycol (PPG)

  • Lecithin

  • Methylcellulose

  • Sodium Carboxymethylcellulose

  • Xylenesulfonates

  • Agar

  • Gelatin

  • Pectin

  • Alginates

  • Starch and Modified Starch

  • Carrageenan

  • Guar Gum

  • Locust Bean Gum

  • Brominated Vegetable Oil

  • Gum Arabic

  • Xanthan Gum

Dough Conditioners and Whipping Agents

  • Sodium Stearoyl Lactylate

  • Calcium Stearoyl Lactylate

  • Sodium Stearyl Fumarate

  • Potassium Bromate

  • Tetrasodium Pyrophosphate

  • Fumaric Acid


  • Caffeine

  • Theobromine

  • Ephedrine


  • Salicylic Acid

  • Sulfur

  • Resorcinol

  • Sodium Bicarbonate

  • Hydroquinone

  • Potassium Nitrate

  • Benzocaine

  • Tramadol

  • Acetylsalicylic Acid (Aspirin)

  • Acetaminophen

  • Ibuprofen

  • Naproxen

  • Allantoin

  • Menthol

  • Methyl Salicylate, Ethyl Salicylate, Glycol Salicylate

  • Camphor

  • Methyl Nicotinate, Benzyl Nicotinate

  • Capsaicin


  • Sodium Hypochlorite

  • Calcium Hypochlorite

  • Hydrogen Peroxide

  • Benzoyl Peroxide

  • Borax

  • Sodium Perborate

  • Sodium Carbonate Peroxide


  • Ammonium Lauryl Sulfate

  • Sodium Lauryl Sarcosinate

  • Lauryl Glucoside

  • Cocamidopropyl Betaine

  • Sodium Dodecylbenzenesulfonate

  • Sodium Isethionate

Foam Stabilizers

  • Cocamide MEA, Cocamide DEA, Cocamide TEA

  • Tetrasodium Pyrophosphate


  • Cetyl Alcohol

  • Cetrimonium Chloride

  • Silicones

  • Panthenol


  • Nitrous Oxide

  • Isobutane

  • Dimethyl Ether

Polymers and Glue

  • Vinyl Acetate

  • Vinyl Alcohol

  • Methacrylate


  • Hydrated Silica

  • Stannous Fluoride

  • Sodium Fluoride

  • Sodium Monofluorophosphate

Unknown (Found In Perfume)

  • Ethylhexyl Methoxycinnamate

  • Limonene - Besides its use as a fragrance (oranges), it is used as both a flavoring and an insecticide.

  • Butyl Methoxydibenzoylmethane

  • Ethylhexyl Salicylate

  • Linalool

  • Hexyl Cinnamal

  • Citronellol

  • Hydroxycitronellal - Perfume odorant, not much known.

  • Coumarin

  • Butylphenyl Methylpropional

  • Alpha-Isomethyl Ionone

  • Citral

  • Geraniol

  • Isoeugenol

  • Benzyl Benzoate

  • Tromethamine

  • Benzyl Alcohol

  • Tris(Tetramethylhydroxypiperidinol) Citrate

  • Benzyl Salicylate

  • Hydroxyisohexyl 3-Cyclohexene Carboxaldehyde

  • Farnesol

  • Cinnamyl Alcohol

  • Anthranilate

Rocket Fuel

Titan II rockets used a hypergolic (mix and it ignites) fuel. It was a 50-50 blend of hydrazine and unsymmetrical dimethyl hydrazine (brand name: Aerozine 50). The oxidizer was nitrogen tetroxide.

Drugs and Pharmacology

  • agonist - ligand that binds to the main (orthosteric) binding site and causes the protein’s effect to occur.

  • antagonist - ligand that binds to the main binding site and causes the protein’s effect not to occur. A.k.a. blockers. They occupy the binding site but in a way that does not trigger the protein’s action. Calcium channel blockers are an example. Question: are antagonists more likely to be smaller than agonists?

  • inverse agonist - ligand that binds to the main binding site and causes it to do the opposite of what the normal agonist would do.

  • allosteric modulator - ligand that binds to a site distinct from the orthosteric agonist binding site, usually inducing a conformational change within the protein structure. It indirectly influences (modulates) the effects of an agonist or inverse agonist at the target protein.

  • orthosteric binding site - the main binding site for the main agonist.

  • positive allosteric modulator (PAM) - a.k.a. allosteric enhancer, induces an amplification of the orthosteric agonist’s effect, either by enhancing the binding affinity or the functional efficacy of the orthosteric agonist for the target protein.

  • negative allosteric modulator (NAM) - attenuates the effects of the orthosteric ligand, but is inactive in the absence of the orthosteric ligand.

  • silent allosteric modulator (SAM) - occupies the allosteric binding site but is functionally neutral.