Supramolecular insight into the substitution of sulfur by selenium, based on crystal structures, quantum-chemical calculations and biosystem recognition

Statistical analysis of data from crystal structures extracted from the Cambridge Structural Database (CSD) has shown that S and Se atoms display a similar tendency towards specific types of interaction if they are part of a fragment that corresponds to the side chains of cysteine (Cys), methionine (Met) selenocysteine (Sec) and selenomethionine (Mse). The most numerous are structures with C—H Se and C—H S interactions ( 80%), notably less numerous are structures with Se Se and S S interactions ( 5%), and Se and S interactions are the least numerous. The results of quantum-chemical calculations have indicated that C—H Se ( 0.8 kcal mol ) and C—H S interactions are weaker than the most stable parallel interaction ( 3.3 kcal mol ) and electrostatic interactions of / type ( 2.6 kcal mol ). Their significant presence can be explained by the abundance of CH groups compared with the numbers of Se and S atoms in the crystal structures, and also by the influence of substituents bonded to the Se or S atom that further reduce their possibilities for interacting with species from the environment. This can also offer an explanation as to why O—H Se ( 4.4 kcal mol ) and N—H Se interactions ( 2.2 kcal mol ) are less numerous. Docking studies revealed that S and Se rarely participate in interactions with the amino acid residues of target enzymes, mostly because those residues preferentially interact with the substituents bonded to Se and S. The differences between Se and S ligands in the number and positions of their binding sites are more pronounced if the substituents are polar and if there are more Se/S atoms in the ligand.

Statistical analysis of data from crystal structures extracted from the Cambridge Structural Database (CSD) has shown that S and Se atoms display a similar tendency towards specific types of interaction if they are part of a fragment that corresponds to the side chains of cysteine (Cys), methionine (Met) selenocysteine (Sec) and selenomethionine (Mse). The most numerous are structures with C-HÁ Á ÁSe and C-HÁ Á ÁS interactions ($80%), notably less numerous are structures with SeÁ Á ÁSe and SÁ Á ÁS interactions ($5%), and SeÁ Á Á and SÁ Á Á interactions are the least numerous. The results of quantum-chemical calculations have indicated that C-HÁ Á ÁSe ($À0.8 kcal mol À1 ) and C-HÁ Á ÁS interactions are weaker than the most stable parallel interaction ($À3.3 kcal mol À1 ) and electrostatic interactions of / type ($À2.6 kcal mol À1 ). Their significant presence can be explained by the abundance of CH groups compared with the numbers of Se and S atoms in the crystal structures, and also by the influence of substituents bonded to the Se or S atom that further reduce their possibilities for interacting with species from the environment. This can also offer an explanation as to why O-HÁ Á ÁSe ($À4.4 kcal mol À1 ) and N-HÁ Á ÁSe interactions ($À2.2 kcal mol À1 ) are less numerous. Docking studies revealed that S and Se rarely participate in interactions with the amino acid residues of target enzymes, mostly because those residues preferentially interact with the substituents bonded to Se and S. The differences between Se and S ligands in the number and positions of their binding sites are more pronounced if the substituents are polar and if there are more Se/S atoms in the ligand.

Introduction
Selenium and sulfur are basic elements with closely related properties, found in nature in a vast array of active compounds, and also used as reactants of numerous chemical and biochemical processes. Selenium is an essential trace element found in selenoproteins in the form of the selenium analogue of cysteine (selenocysteine, Sec). Although it is toxic at higher concentrations, recent research indicates that it might also be effective in preventing different types of disease (Vinceti et al., 2019;Li et al., 2019;Raygan et al., 2019), mostly because of its high chemical reactivity in metabolism. As these elements are very similar when it comes to electronegativity (2.58 for sulfur and 2.55 for selenium), number of oxidation states and ionic radius, they also share many properties (Yoshizawa & Bö ck, 2009) However, a few important differences between these amino acids should be pointed out. The pK a of Sec is much lower (5.3) than that of Cys (8.3) (Allmang & Krol, 2006;Muttenthaler & Alewood, 2008) which makes Sec a considerably more potent nucleophile under neutral and acidic conditions. In addition, selenium is softer than sulfur, with a polarizability volume of 3.8 Å (compared with just 2.9 Å for S) (Steinmann et al., 2010).
Non-covalent interaction has been identified as a driving force for crystal packing (Caracelli et al., 2012), self-assembly of (macro)molecules (Voth et al., 2009), stabilization of different systems (Wang et al., 2019;Breugst et al., 2017) and biological recognition (Daze & Hof, 2016). Hydrogen bonds (HBs), a class of non-covalent interaction, are generally the most abundant and the most studied.
It is also worth mentioning that the importance of various classes of non-covalent interaction, such as those involving the tetrels (Group 14 in the periodic table), pnictogens (Group 15), chalcogens (Group 16), halogens (Group 17) and the noble gases (Group 18), has been recognized over the years (Bauzá et al., 2013;Bauzá & Frontera, 2018;Metrangolo & Resnati, 2001;Brammer, 2017;Benz et al., 2018;Politzer et al., 2014;Bauzá & Frontera, 2015;Frontera & Bauzá , 2017). In the oxygen family, the non-covalent interactions involving O and S have been extensively investigated. By comparison, interactions that involve Se and Te have only recently gained more attention. In organoselenium compounds, selenium is most frequently present in the divalent state, and as such displays two -holes, electropositive regions located along the bonding axis and perpendicular to the two lone pairs of valence electrons (Murray et al., 2009).
Intermolecular N-HÁ Á ÁSe interactions were observed more than 50 years ago by Hope (1965), while intramolecular C-HÁ Á ÁSe interactions were characterized by Iwaoka & Tomoda (1994). Further spectroscopic and crystal structure investigations confirmed the presence and significance of selenium hydrogen-bonding interactions within biomacromolecular systems, the crystal packing, and the synthesis of chiral selenones (Mukherjee et al., 2010). More recently, a modest number of theoretical or combined studies involving X-HÁ Á ÁSe have been carried out. The influence of various electron-donating and -withdrawing substituents (Q) in the para position of cationic phenol [4-Q-C 6 H 4 -OHÁ Á ÁSeH 2 ] + was investigated by Das et al. (2017). Madzhidov and Chmutova analysed the nature of X-HÁ Á ÁSe hydrogen-bonded complexes of several organoselenium compounds using quantum-chemistry methods. They pointed out the unusual electrostatic repulsion between Se and H atoms located opposite to classical HÁ Á ÁN or HÁ Á ÁO contacts, but the greater stabilization effect of charge transfer and covalence in the former (Madzhidov & Chmutova, 2010). The implications of S-HÁ Á ÁS and Se-HÁ Á ÁSe in determining the molecular arrangement in the crystal packing of thiophenol and selenophenol were reported by Thomas et al. (2015).
Non-covalent interactions of Se(lone pair)Á Á Á(aryl) or Se(lone pair)Á Á Á(lone pair) type are probably the most studied and well documented. Small molecules retrieved from the Cambridge Structural Database (CSD; Groom et al., 2016) have been used as model systems to study the interactions between an Se atom and aromatic molecules. In their paper, Hartman et al. (2006) came to the conclusion that these noncovalent interactions may participate in the stabilization of Secontaining drugs and protein active sites. Caracelli and coworkers have pointed out that these Se(lone pair)Á Á Á(aryl) interactions lead to zero-or one-dimensional aggregation patterns, based on a crystal structure database search of selenium/aryl-containing compounds (Caracelli et al., 2012). In the last few years, Saberinasab and co-workers have studied the effects of electron-donating and electron-withdrawing substituents R = H, F, Cl, CH 3 on R 2 SeÁ Á Ábenzene interactions and the effect of a strong cationÁ Á Á interaction in the R 2 SeÁ Á ÁC 6 H 6 Á Á ÁM + model system (M + = Li + , Na + and R = H, CH 3 ) (Saberinasab et al., 2016a,b). A theoretical study published recently explored the Se-HÁ Á Á(aryl) and Se(lone pair)Á Á Á(aryl) interactions between an H 2 Se group and unsubstituted or monosubstituted benzene molecules, and these results were further compared with an analogous series of sulfur-benzene dimers. The study found that selenium interacts more strongly with aromatic systems than sulfur, which could be important in the stabilization and inhibition of proteins, indicating that the Se-H group is a better donor than the S-H group when it comes to hydrogen bonding (Senćanski et al., 2017).
Non-covalent interactions of SeÁ Á ÁX type (X = heteroatom) have also attracted a great deal of attention. Sanz and coworkers have published several articles on the competition in strength between O-HÁ Á ÁSe or S-HÁ Á ÁSe intramolecular hydrogen bonds and OÁ Á ÁSe and SÁ Á ÁSe interactions (Sanz et al., 2002(Sanz et al., , 2003a. The analysis of crystal structures with an Se-SeÁ Á ÁX fragment (X = Se, S, O) has shown the correlation between the Se-Se bond length and the strength of the SeÁ Á ÁX interaction. The trend in Se-Se distances for different Se-SeÁ Á ÁX fragments follows the sequence of X as Se > S > O (Linden et al., 2014). Iwaoka and co-workers made an important contribution to the study of SeÁ Á ÁO, SeÁ Á ÁN and SeÁ Á Áhalogen interactions (Iwaoka et al., 2002a(Iwaoka et al., ,b, 2004Komatsu et al., 1999;Iwaoka & Tomoda, 1996). So far, several important papers have been published concerning chalcogenchalcogen SeÁ Á ÁS, SeÁ Á ÁSe and SeÁ Á ÁTe types of interaction. In one systematic approach, Tiecco and co-workers reported that SeÁ Á ÁS interactions have a high directionality influence on asymmetric syntheses. They came to the conclusion that SeÁ Á ÁS contacts have a more pronounced selectivity in comparison with the influence originating from SeÁ Á ÁO or SeÁ Á ÁN (Tiecco et al., 2000(Tiecco et al., , 2002(Tiecco et al., , 2003(Tiecco et al., , 2004(Tiecco et al., , 2006a. The same authors also offered experimental evidence for the presence of SeÁ Á ÁS interactions in the crystal form, as well as in CDCl 3 solution (Tiecco et al., 2006b). Also, a few scientific papers deal with SeÁ Á ÁSe and SeÁ Á ÁTe interactions in an attempt to give a better insight into the nature of these interactions (Ibrahim & Safy, 2019;Bleiholder et al., 2007;Sá nchez-Sanz et al., 2011).
Theoretical characterization of the impact of sulfur-toselenium substitution on hydrogen-bonding potential and photophysical properties in an emissive RNA alphabet has shown that the replacement of sulfur with selenium in derived nucleobases has a minimal effect on the geometries and energies of base pairs in the classical Watson-Crick model, while base pairs in the Hoogsteen model are destabilized, compared with natural pairs (Chawla et al., 2018).
The concept of the present work includes a statistical analysis of the interactions of S and Se sites from Cys and Met residues and their Se derivatives (Sec and Mse) in crystal structures retrieved from the CSD. Three model systems were used to search the CSD and examine whether the replacement of an S with an Se atom provides a significant change in the number and variety of certain types of interactions with surrounding species. Density functional theory (DFT) calculations on hydrogen bonds between CH 4 , NH 3 or H 2 O molecules as donors for hydrogen bonding, and CH 3 SeCH 3 and CH 3 SeH molecules with both donor and acceptor roles, have been carried out. We also investigated the parallel interaction and -type interactions between CH 3 SeCH 3 and CH 3 SeH molecules. Finally, we assessed the interactions of S and Se analogues with biochemical systems based on results from molecular docking, in order to determine whether our prediction, based on the CSD results and our calculations, fits the interpretation for binding of compounds that contain Se and/or S atoms to biomolecules.

Methodology
The crystallographic study is based on structures archived in the Cambridge Structural Database (CSD, Version 5.36; Groom et al., 2016). A particular structure was considered a 'hit' if the distance between the Se and A (any atom) was less than 4.5 Å (for Se interactions) or the distance between the S and A (any atom) was less than 4.0 Å (for S interactions) (Fig. 1). This distance was denoted the d parameter.
The program ConQuest (Version 1.10; Bruno et al., 2002) was used to retrieve all structures from the CSD satisfying the following criteria: (i) error-free coordinates according to the criteria used in the CSD, (ii) not disordered structures, (iii) H-atom positions normalized using the CSD default bond lengths, (iv) no powder structures, (v) no polymer structures, (vi) determined 3D coordinates, and (vii) a crystallographic R factor less than 0.10.
All quantum-chemical calculations were performed using the GAUSSIAN09 program (Frisch et al., 2013). The geometries of isolated methaneselenol (CH 3 SeH), dimethylselenide (CH 3 SeCH 3 ), water, ammonia and methane molecules were optimized and further used to calculate interaction energies in all dimer model systems. Surface investigations were performed on the MP2/aug-cc-pVDZ level. The CCSD(T)/ CBS method offered information on interaction energies and limits for calculated minima. The CCSD(T)/CBS interaction energies were evaluated using the extrapolation scheme of Makie & DiLabio (2011), including calculations at the MP2/ aug-cc-pVDZ, MP2/aug-cc-pVTZ, MP2/aug-cc-pVQZ and CCSD(T)/aug-cc-pVDZ levels. The basis set superposition error (BSSE) was corrected for all calculated energies based on the well known counterpoise method (Boys & Bernardi, 1970).
By comparing selected computational levels with the CCSD(T)/CBS energy values, the desired accuracy can be achieved. However, the choice of the best method (the best match to calculated limits and the most time-saving) for a particular property is realized through a rigorous benchmark study of a broad set of quantum-chemical methods and basis sets, along with the LC-WPBE, TPSS, TPSS-D3, M062X-D3, CAM-B3LYP-D3, CAM-B3LYP-D3BJ, WB97XD and PBE0 functionals. The basis sets aug-cc-pVDZ, aug-cc-pVTZ and def2-TZVP were tested for each functional. The calculations were also performed for interactions between two molecules of CH 3 SeCH 3 and CH 3 SeH along the C-Se and H-Se bonds.
The molecular electrostatic potential surface (MEPS) has been shown to be an effective method for predicting noncovalent interactions, helping us to understand them better in our model systems. As a preliminary study, MEPS analysis was performed for both molecules, CH 3 SeCH 3 and CH 3 SeH. Using GAUSSIAN09, the MP2 method and the aug-cc-pVQZ basis set to calculate wavefunctions, we were able to investigate electrostatic potential maps for both molecules.
All enzyme structures used in the docking studies were extracted from the Protein Data Bank (PDB; Berman et al., 2000). In the subsequent step of enzyme preparation from that structure, all water molecules and ligands were removed. The compounds of S and Se, treated as ligands for binding to biomolecules, were optimized at the B3LYP/aug-cc-pVDZ level using GAUSSIAN09. The AutoDockTools program was applied to the process of docking preparation, while the docking study was achieved with the AutoDock Vina program (Trott & Olson, 2010). All enzyme residues were kept rigid during the calculation, while the single bonds of the ligands were set to rotate. The whole enzyme was placed in a grid box large enough to accommodate the ligands and allowing it to move freely during the docking run.
According to the father of 'supramolecular chemistry' J. M. Lehn (Lehn, 1994), it is defined as the chemistry of the intermolecular bond, covering the structures and functions of the entities formed by the association of two or more chemical species. In addition to that, it is predominantly based upon tailor-made intermolecular interactions. Biological systems offer valuable inspiration and motivation that guide supramolecular research. The wide spectrum of scientific research in supramolecular chemistry is strongly supported by crystal structure data. The CSD contains data for more than 1 000 000 crystal structures and provides good insight into inter-  The model systems and geometric parameter d of (a) geometry 1, (b) geometry 2 and (c) geometry 3, used for searching the CSD. molecular interactions and information, conformational preferences, drug design and delivery. One other productive database we have used in our research is the PDB. This is the global archive of 3D structural models for many biological macromolecules and their complexes. Most structures in this database were determined using X-ray crystallography, NMR or 3D cryo-electron microscopy (cryo-EM). Our docking results and accompanying information on binding energies were obtained on protein structures derived from the PDB.

CSD analysis of Se and S interactions
Our search of the CSD yielded 552 structures for geometry 1, 1507 structures that correspond to geometry 2 and 2563 structures for geometry 3 (Fig. 1). After simple elimination of all contacts that had Se/SÁ Á ÁX, Se/SÁ Á ÁY and Se/SÁ Á ÁZ distances shorter than the Se/SÁ Á ÁA distance, only 7714 contacts remained in the search results for geometry 1, 829 contacts for geometry 2 and 166 contacts for geometry 3 (Table 1). All parameter distributions used to describe the geometry of the analysed contacts are presented in Fig. 2. The distribution of the parameter d for Se interactions reveals that all contacts from geometry 1 appear in the lower values compared with the other two geometries. The contacts of geometries 2 and 3 show distribution maxima between 3.5 and 3.9 Å , but the contacts of geometry 1 do not show a similar strong tendency to have parameter d in a narrow range of values.
Different trends are observed for the distribution of angles in the P 1 plane (formed by a C-Se-R fragment) with vector V 1 (X-A vector) for geometry 1, in the P 2 plane (formed by an X-A-Y fragment) for geometry 2 or in the P 3 plane (formed by atoms X, Y and Z) for geometry 3. Although the distribution of the d parameter is very similar in geometries 2 and 3, their angle distributions are quite different. Angles between 0 and 20 are attributed to parallel orientation, angles between 70 and 90 are connected with orthogonal orientation, and all remaining angle values (between 20 and 70 ) are attributed to slope orientation. The angle distributions (Fig. 2) reveal a strong tendency of all three groups to slope orientations. Additionally, orthogonal orientations are more abundant than parallel ones in the first and third geometries. The contacts from geometry 2 have an equal contribution from both orientations ($25%).
All important differences in the distances and mutual orientations of all fragments that interact are connected to the natures of those interactions. When the influence of the nature of an atom in position A is considered (Table 1), C-HÁ Á ÁSe interactions become the most numerous in group 1 (2859 Csp 2 -HÁ Á ÁSe contacts, 2294 Csp 3 -HÁ Á ÁSe contacts and 759 Csp-HÁ Á ÁSe contacts). Additionally, the most abundant interactions in group 2 are those between two Se atoms (432 SeÁ Á ÁSe contacts) and SeÁ Á ÁS (163 contacts), while SeÁ Á Á interactions are observed in group 3 (148 contacts). Compared with these, interactions (contacts with a C-SeÁ Á ÁA, R-SeÁ Á ÁA, X-AÁ Á ÁSe, Y-AÁ Á ÁSe or Z-AÁ Á ÁSe angle higher than 160 ) are rare: 168 interactions in the first group, 182 in the second and only 28 interactions in the third.
Based on statistical results, selenium has a tendency to build hydrogen bonds of the C-HÁ Á ÁSe type. These structures are 150 times more numerous than those where classical hydrogen-bond donors (O-H and N-H groups) interact with Se (see the supporting information). When all contacts where the SeÁ Á ÁX, SeÁ Á ÁY or SeÁ Á ÁZ distances are shorter than the SeÁ Á ÁA distance are eliminated, the main database of 2563 structures is reduced to 166 contacts. This obviously indicates that the Se atom prefers to participate in interactions with aryl C-H groups in C-HÁ Á ÁSe interactions rather than with systems in SeÁ Á Á interactions.
Although our goal was to gain insight into the supramolecular structure for the Cys and Met side residues and their Se derivatives Sec and Mse, we did not restrict the atoms in the R position during the CSD search. In this way we have obtained a more complete view of the substitution of S with Se as a function of the various substituents at the R position. Structures with Csp 2 and Csp 3 atoms in position R form the largest group of contacts (Table 1). An interesting result is the abundance of structures with an Se-Se bond (782 contacts). There are 608 contacts with a metal ion M in the R position and 53 contacts with an H atom in the R position. A statistical analysis of all contacts with an H atom in the R position (corresponding to a Sec side chain) or a Csp 3 atom (corresponding to an Mse side chain) indicated that C-HÁ Á ÁSe interactions are many times more numerous than the other types of interaction. The second most common type of inter-  Table 1 The results of our statistical analysis of selenium and sulfur interactions in crystal structures. action is the SeÁ Á ÁSe interaction. The number and distribution of certain types of interaction in these two groups of contacts are consistent with the overall trend for all contacts from our CSD search. The most numerous are the structures in which a Csp 2 or Csp 3 atom is bound to sulfur in the R position. Less numerous are structures with a metal ion M (14 563 contacts) or S (3728 contacts) in the R position. Structures with an H atom in the R position (2140 contacts) were also found. Similar to selenium, these structures including S and H atoms (corresponding to a Cys side chain) and a Csp 3 atom (corresponding to a Met side chain) in the R position were analysed separately. A statistical analysis showed that C-HÁ Á ÁS interactions are drastically more numerous than the others. For structures with a frag-ment corresponding to a Met side chain, the SÁ Á ÁO interactions (1855 contacts) and SÁ Á ÁS interactions (1082 contacts) are the second most abundant type of interaction. In the case of a Cys fragment, the second most common are the SÁ Á ÁO contacts, as a consequence of the S-HÁ Á ÁO interactions (285 contacts). Interestingly, the number of structures with electronegative atoms (O, N or a halogen element) in the R position is negligible. This suggests that the view of the supramolecular structures we have obtained here refers to structures containing nonpolar groups bonded to an S or Se atom.
The results obtained by a statistical analysis of geometric parameters can be illustrated by analysing packages in the crystal structure of [(4-methoxybenzyl)selanyl] acetic acid research papers (refcode EMOCEP, Fig. 3). In this structure, the considered molecule contains Se, carboxyl, aryl and methoxy groups. In this crystal structure the main synthon is a dimer that contains bifurcated hydrogen bonds formed by carboxyl groups (red dashed lines in Fig. 3). These dimers are the main components of lamellae, where each molecule of acid interacts differently with the ones under and above it. In one of these orientations, the Se atom participates in a C-HÁ Á ÁSe bond with the aryl group of an adjacent acid molecule (the length of this bond is 3.15 Å ). In another orientation, acid molecules form interactions that include Se. The acid groups simultaneously form SeÁ Á ÁSe interactions (length 3.70 Å ) and C-HÁ Á ÁSe interactions with an aryl group (length 3.03 Å ).
When SeÁ Á ÁSe interactions are observed, the angle between two contact C-Se-C planes is 62.3 . Based on this example, it can be concluded that the Se atom prefers to participate in interactions with C-H groups rather than aryl or O-H groups, which are reserved to formaromatic interactions and classical hydrogen bonds, respectively. It is also very interesting that the Se atom does not interact with alkyl C-H groups, even if they are more numerous than the aryl C-H groups.
3.1.1. SeÁ Á ÁSe interactions. These have been treated as a sub-subsection until now and their geometry has not been described in detail. These interactions belong to the second group of contacts. The length distribution of SeÁ Á ÁSe interactions is similar to that of contacts retrieved from the CSD search of geometry 2 (the majority of interaction lengths are between 3.5 and 3.9 Å ). The main geometric difference is connected with the distribution of the P 1 /P 2 angle (Fig. 4). This distribution shows that SeÁ Á ÁSe contacts prefer to form interactions with parallel orientations of contact R-Se-C planes.
A typical example of a parallel interaction is the crystal structure with refcode AKANAB (Fig. 5). This structure contains bis(ethylenediseleno)tetrathiafulvalene molecules, where the five-membered ring with two S atoms is condensed with a six-membered ring with two Se atoms. Fulvalene derivatives form lamellae where two different types of packing are found. In one packing, the six-membered rings of neighbouring fulvalenes are in contact through bifurcated C-HÁ Á ÁSe interactions (lengths of 2.96 and 3.16 Å ). In the second packing, the six-membered rings are placed above the fivemembered rings, forming multiple SeÁ Á ÁS interactions (lengths between 3.7 and 3.9 Å ). Fulvalene derivatives from adjacent lamellae simultaneously form SeÁ Á ÁSe and SeÁ Á ÁS interactions. In these interactions all the SeÁ Á ÁSe distances (3.54 and 3.62 Å ) are shorter than the SeÁ Á ÁS ones (3.76 and 4.04 Å ). All these results point to stronger interactions between Se and another Se atom compared with an S atom. Supporting this, the number of SeÁ Á ÁSe contacts is several times higher than the number of SeÁ Á ÁS contacts in the CSD (Table 1).

Quantum-chemical calculations
Our analysis of statistical data on all contacts has pointed to a greater preference of Se and S atoms for hydrophobic regions in order to participate in C-HÁ Á ÁSe and C-HÁ Á ÁS interactions. The substitution of S with Se does not bring a major change in the supramolecular structure of its deriva- The crystal packing of [(4-methoxybenzyl)selanyl] acetic acid, illustrating the interactions of selenium (red dashed lines) . Distances are in Å .

Figure 4
Distributions of (a) parameter d and (b) the P 1 /P 2 angle, used for describing SeÁ Á ÁSe interactions in crystal structures extracted from the CSD.
tives. The main difference could be expected to be in the strength of their interactions. In some cases, this strength plays a crucial role in the chemistry and function of their compounds. The strengths of hydrogen, parallel and sigma interactions were calculated for both molecules, CH 3 SeCH 3 and CH 3 SeH. These two molecules represent models for Se derivatives of Cys and Met residues.
3.2.1. Electrostatic potential maps. Se atoms have a region of negative and a region of positive potential (Fig. 6), enabling electrostatic attraction between Se atoms. Similar behaviour is observed with S atoms (Antonijević et al., 2016). Red areas in the electrostatic potential maps indicate an increase in the electron density, while a positive potential is denoted by blue regions.
As can be seen from the maps, there is a positive electrostatic potential area ( region) at the end of the Se atom along the C-Se and H-Se bond vectors of both molecules, which can act as electron acceptors (yellow and green regions). The most positive potentials are localized on the H atoms. In addition, the maps show a negative region above the Se atom ( hole) in the direction of the free electron pairs. This negative region on one CH 3 SeCH 3 or CH 3 SeH molecule can be paired with a more positive region on the Se atom of the other molecule through --hole interactions. The MEPS results show that electrostatic interaction could play an important role in the formation of theseinteractions.

Hydrogen bonding.
An evaluation of the strength of Se hydrogen bonds was done on eight model systems, numbered HB1-HB8 (Fig. S1 in the supporting information).
In the first six model systems (Table 2), CH 4 , NH 3 or H 2 O molecules were used as donors of H atoms, while CH 3 SeCH 3 or CH 3 SeH molecules were used as hydrogen acceptors. For each model system (Fig. 7)    Electrostatic potential maps of (a) CH 3 SeCH 3 and (b) CH 3 SeH.
For calculations on the HB1 model system, acceptable results are only given with the TPSS-D3 and TPSS-D3BJ functionals. If we expand the criterion for accuracy of calculations to 20% of the CCSD(T)/CBS energy, then all the functionals are acceptable for this model system, except for those at the TPSS, M062X-D3 or PBE0 levels. Similar results are obtained for the HB2 model system, while for the HB3 model system, in addition to those three mentioned functionals, WB97XD should also be strongly avoided. For calculations based on the HB4 model system, acceptable results are only given with the TPSS-D3 and TPSS-D3BJ functionals.
For the HB5 and HB6 model systems all the functionals are accepted, except for the TPSS, M062X-D3, WB97XD and PBE0 functionals. The best functionals for the HB7 and HB8 model systems are TPSS-D3 and, surprisingly, TPSS, while acceptable results were also obtained with the CAM-B3LYP-D3 and CAM-B3LYP-D3BJ functionals. Functionals PBE0 and M062X-D3 should be avoided because they yielded no accurate results. The interaction energies for the HB7 and HB8 model systems are slightly higher [ÁE CCSD(T)/CBS = À2.08 and À2.47 kcal mol À1 , respectively; 1 kcal mol À1 = 4.184 kJ mol À1 ] than the energies for the corresponding systems with an NH 3 molecule as acceptor [ÁE CCSD(T)/CBS = À1.85 and À2.15 kcal mol À1 , respectively], indicating that CH 3 SeH is a better H-atom donor than NH 3 .
Based on the given criteria one can conclude that the TPSS and M062X-D3 functionals should not be used for most of the systems [except for HB7 and HB8, where TPSS provides strong agreement with the CCSD(T)/CBS values]. Also, it can be noted that significantly fewer functionals can be used to calculate weak interactions (in the HB1 and HB4 model systems) than moderate (HB2 and HB5 model systems) or strong interactions (HB3 and HB6 model systems  Table 2 Energies of the Se hydrogen bonds HB1-HB8 (in kcal mol À1 ), calculated in the procedure for evaluating CCSD(T)/CBS limits and obtained in the benchmark studies. respectively]. The calculations also show that CH 3 SeCH 3 is a better acceptor of hydrogen bonding than CH 3 SeH. All these calculations offered us the possibility of also using MP2/aug-cc-pVTZ and MP2/aug-cc-pVQZ because of their good agreement with the CCSD(T)/CBS level for the majority of HB models. However, these levels of calculation are time costly so we do not recommend them.
3.2.3. Parallel interactions. The results of our statistical analysis of crystal structures indicate that SeÁ Á ÁSe interactions with parallel orientation of the C-Se-R planes are the most numerous. The calculations were performed on five model systems (Fig. 8): three model systems with methane-selenol dimers (P1, P2 and P3 model systems) and two model systems with dimethylselenide dimers (P4 and P5 model systems). Scheme 1 shows the parameters describing the geometries of parallel interactions in the studied Se derivatives, and the directions of molecular movement during the examination of the strength of the parallel interactions. In the P1 and P4 model systems, the Se atom of the first molecule is moved along direction 1 containing the height of the C-Se-X triangle (X = C or H) of the second molecule (Scheme 1). In other systems, the Se atom is moved along direction 2 passing through the centre of the Se-H (P3 model system) or Se-C bond (P2 and P5 model systems) (Scheme 1). The starting point for moving is the Se atom of the P1 and P4 model systems, the centre of the Se-C bond for the P2 and P5 model systems, or the centre of the Se-hydrogen bond for the   Table 3 Interaction energy values (in kcal mol À1 ) and geometric parameters (in Å ) for the most stable parallel andinteractions.  The model systems of methaneselenol dimers (P1, P2 and P3 model systems) and dimethyl selenide dimers (P4 and P5 model systems) for parallel interactions.
. Our calculations have shown that the movement of parallel fragments from the starting point leads to a reduction in interaction energies. Orientations with the Se atom above the centre of the C-Se bond (P2 and P5 model systems) are more stable than those of the corresponding species with the Se atom above the centre of the Se-H bond (P3 model system). The least stable orientations (P1 and P4 model systems) have the Se atom located above the Se atom of the second molecule (Table 3). The energy of the most stable parallel orientation of two dimethylselenide molecules is considerably greater [ÁE CCSD(T)/CBS = À3.27 kcal mol À1 ] than the energy of the most stable parallel interaction between two methaneselenols [ÁE CCSD(T)/CBS = À2.21 kcal mol À1 ]. The results of these calculations show that the substitution of an H atom by an alkyl group leads to a strengthening of SeÁ Á ÁSe interactions with parallel orientation.
The benchmark study revealed that the TPSS-D3 and CAM-B3LYP-D3BJ functionals are suitable to obtain a good match with calculated energy values at the CCSD(T)/CBS level (Table 3). To get excellent matches, a useful method is TPSS-D3BJ, while the rest are not recommended.
The analysis of the interactions of cysteine residues (R-CH 2 SH) in the crystal structures from the CSD showed that SÁ Á ÁS interactions with a parallel orientation of the C-S-H planes are the most numerous (Antonijević et al., 2016). Quantum-chemical calculations at the CCSD(T)/CBS level were performed on model systems of methanethiol dimers with parallel orientations and with starting geometries similar to the geometries presented here. In the most stable geometry of a methanethiol dimer with parallel orientation [ÁE CCSD(T)/CBS = À1.80 kcal mol À1 ], the S atom is located above the centre of the S-C bond (Antonijević et al., 2016). Based on these findings and on our calculations, it can be concluded that the replacement of an S atom with Se leads to an increase in the strength of parallel interactions.
3.2.4. r-p-hole interactions. Model systems used for investigatinginteractions (electrostatic models, ES) are based on dimers of CH 3 SeCH 3 or CH 3 SeH (Fig. 9). The main parameter used to describe the geometries of these interactions is the distance between the Se atoms (d). The angle between the C-Se vector (or H-Se vector) and the C-Se-X plane of the second molecule (X = C or H) is constant and takes a value of 62.3 . During the calculation the distance d was changed, while the value of the C-SeÁ Á ÁSe angle (ES1 and ES3 model systems) or H-SeÁ Á ÁSe angle (ES2 model system) was kept at 180 . Interaction energies are calculated for all geometries with parameter d between 3.4 and 4.5 Å . Results for different values of parameter d are offered as part of the supporting information, Fig. S3.
The interaction energies calculated at the CCSD (T)/CBS level for each model system minimum are presented in Table 3. The whole benchmark study for three model systems (ES1, ES2 and ES3) is also presented in Table 3. The results of quantum-chemical calculations show that the --hole (SeÁ Á ÁSe) interaction is strongest in the case of the model system with two CH 3 SeCH 3 molecules [ES3, ÁE CCSD(T) = À2.55 kcal mol À1 ]. This interaction also has the shortest distance between Se atoms (d = 3.8 Å ); this distance corresponds to the sum of the Se van der Waals radii (1.9 Å ). Systems with CH 3 SeH molecules (ES1 and ES2) have weaker interactions [ÁE CCSD(T) = À1.51 and À2.06 kcal mol À1 , respectively].
The benchmark study revealed the functionals that are suitable to obtain good matches with the calculated CCSD(T)/ CBS limits (the TPS and PBE0 functionals). To get very good matches, useful ones are CAM-B3LYP-D3BJ and MP2, while the rest should be avoided. In addition, the more positive effect of including empirical dispersions (D3 and D3BJ) on the matching with limits can not be denied.

Docking study
The effect of substitution of S by Se on interactions with biosystems will be illustrated through several examples, and the obtained results will be analysed by molecular docking. Molecular docking is a very important tool when it comes to drug discovery and design. It is used to model the interactions between a small molecule and a protein at the atomic level, and it allows us to characterize the behaviour of small molecules on the binding sites of target proteins. It consists of prediction of the ligand's conformation, position and orientation within these sites and assessment of the binding affinity. The model systems of methaneselenol dimers (ES1 and ES2 model systems) and dimethyl selenide dimer (ES3 model system) for --hole interactions.
One important advantage of molecular docking is the possibility of interpreting the substitution of S by Se in all those cases where there are no direct interactions between amino acid residues and those two atoms. Rather, the interactions are indirect, via groups attached to them. When a ligand binds to a biomolecule, it can locate in such a manner that the S and Se atoms are oriented away from the biomolecule and therefore not participating in interactions with amino acid residues. In contrast, in crystal structures generally, crystal packing rules dictate that all the fragments present should participate in interactions with different species from the surrounding environment, although there is a possibility that fragments form only intramolecular interactions and approaching species in the wider environment are sterically hindered.
Validation of the docking procedure by the AutoDock Vina program was confirmed by redocking of ITU (iodotubercidin) at the iNOS enzyme (inducible nitric oxide synthase), extracted from the crystal structure with PDB code 4nos (Fischmann et al., 1999). More details are available in the supporting information (Fig. S4). The docking study indicates that ITU (a well known inhibitor for 4nos) binds on the same site as in the crystal structure (Fig. S4), in the narrow cleft inside the larger active-site cavity that also contains a haem complex. The cleaned enzyme structure was further docked with four different compounds, corresponding to selenium and sulfur analogues (PBIT, PBISe, PEIT and PEISe, Fig. 10). The statistical analysis indicated that crystal structures with a Csp 2 atom in the R position are the most numerous (Table 1). This residue is also present in the above-mentioned structures, implying their good fitting qualities for the chosen model for our CSD search. PBIT is a well known inhibitor for iNOS, implicated as a potential therapeutic target in malignant melanoma, the most deadly form of skin cancer (Madhunapantula et al., 2008). However, PBIT requires high concentrations for clinical efficacy and at the same time causes systemic toxicity. PBISe is its selenium derivative and a more potent agent, effective at significantly lower concentrations (Madhunapantula et al., 2008). The monosubstituted forms of PBIT and PBISe (PEIT and PEISe) were also examined (Madhunapantula et al., 2008).
Our results from this docking study suggested that PBISe could be the best targeting inhibitor for the active site of iNOS (Fig. 11). This is in strong agreement with experimental results showing that PBISe kills melanoma cells more than ten times more effectively than other iNOS inhibitors (PBIT, PEISe and PEIT) (Madhunapantula et al., 2008). All obtained orientations of PBISe are clustered at the same site (active site, position 1, Fig. 11), leading to the much better efficiency of this molecule compared with PBIT, PEISe and PEIT where numerous binding sites can be observed.
The influence of the nature of surrounding amino acids was investigated by analysing the most stable orientation of each ligand (PEISe, PEIT, PBISe and PBIT) for all binding sites. In addition, all observed orientations with PEISe and PBISe (containing Se atoms) were considered as one group and those with PEIT and PBIT (containing S atoms) as the other. From the results of this statistical analysis (Figs. S5-S15) it is evident that the amino acid environment of the presented compounds on their binding sites is non-polar in both cases and independent of the type of atom (sulfur or selenium). By comparison, all interactions with negative amino acids are less abundant. We observed a significantly larger number of these interactions with clusters that contain Se atoms compared with those with S atoms (the same tendency as observed with nonpolar acids). One possible explanation for this trend is the larger radius of selenium compared with sulfur, leading to better electron correlation for contacts of Se derivatives with negative amino acid residues. The analysed compounds achieved the fewest contacts with polar and positive amino acid residues. In all the investigated systems there is no clear evidence of a direct interaction between an amino acid residue and an S or Se atom. Differences in the binding affinities of the The chemical structures of the active S and Se derivatives, PBIT, PBISe, PEIT and PEISe, used for the docking study of the iNOS enzyme (Madhunapantula et al., 2008).

Figure 11
The binding sites and corresponding binding energies of (a) PBISe, (b) PBIT, (c) PEISe and (d) PEIT on the iNOS enzyme. contain S and/or Se, even when there are no direct interactions of these atoms with amino acid residues (Figs. S5-S15). The investigation of this effect cannot be conducted based on crystal structure analysis, since these atoms in the crystal structures have interacted with some chemical species in their vicinity, as a consequence of the crystal packing. Docking allows the binding of molecules and biosystems, but not all fragments of a molecule may be involved in interactions with the biomolecule.
Similar results were obtained for docking studies that involved eurothiocin A and its Se derivative (Se-eurothiocin A) which, according to the structures of their S and Se centres, are similar to the Met and Mse residues. Specifically, it was shown that the sulfur-containing benzofuran derivative (eurothiocin A), exhibits more potent -glucosidase inhibitory effects than the clinical inhibitor acarbose (Liu et al., 2014). Eurothiocin A and Se-eurothiocin A are docked at the structure of -glucosidase. The results obtained by this docking study have shown that both compounds have a tendency to bind on the enzyme active site (located near the catalytic residues Glu277 and Asp352, Fig. 12). Eurothiocin A has a slightly lower binding energy (À7.3 kcal mol À1 ) than the corresponding Se derivative (À7.4 kcal mol À1 ). As neither S nor Se atoms participate directly in interactions with amino acid residues on the most stable binding site (Fig. 12), the main differences in binding energies could also be attributed to electron-induced effects. The Se derivative also interacts with a larger number of negative amino acid residues compared with the S derivative, as in the previous system.
To examine whether the substitution of S by Se in drugs which contain a fragment that corresponds to the side chains of Cys or Sec has any effect on the supramolecular structures of these derivatives with biosystems, docking studies of captopril and its Se derivative (Se-captopril) on angiotensin I-converting enzyme (ACE) were undertaken. ACE is a zincdependent dipeptidyl carboxypeptidase, and a well known target for the treatment of hypertension and related cardiovascular diseases. Captopril is a well known clinical inhibitor of ACE, but it is known that the selenium analogue (Secaptopril) also inhibits ACE . These two compounds have an SH or SeH group (monosubstituted S/Se atom) bonded to a non-polar chain (Fig. 13). This is probably the reason why, in this case, the S or Se atom interacts with an amino acid residue (His367) and a Zn 2+ ion from the active site of the enzyme (Fig. S16). In the most stable orientations, captopril and Se-captopril bind at the active site of ACE, in an analogous fashion to that observed in the crystal structures of the captopril/ACE (PDB code 2x8z; Akif et al., 2010) and Se-captopril/ACE complexes (PDB code 3zqz; Akif et al., 2011). These two structures present the zinc coordinating to the sulfur-and selenium-based ACE complexes, with significant antioxidant activity. Se-captopril is an excellent inhibitor of ACE, although its IC 50 value is twice as high as that for captopril . The results of the docking studies have shown that these two compounds bind at the active site of the enzyme with similar binding energies (À5.8 and À5.7 kcal mol À1 ), indicating that the coordination of these compounds is a decisive factor in their different activities. Although the formation of a selenol-Zn complex is slightly favoured over the formation of a thiolate-Zn complex (Pearson, 1990;Salter et al., 2005), the possible reason for the relatively high IC 50 value of Secaptopril is partial oxidation of the selenol to the corresponding diselenide derivative. This is further supported by the assumption that diselenides do not inhibit enzyme activity  and by the results of molecular docking which showed that the disulfide and diselenide derivatives of captopril and Se-captopril are bonded on the active site of ACE (Fig. S17), although in this case a carboxyl group is in contact with a Zn 2+ ion, but not an S or Se atom.
The docking results indicate that the Cys and Sec fragments of the above-mentioned drugs have a tendency to interact with the same amino acid residues of the ACE enzyme (Fig. S16). This is consistent with the results of our CSD analysis, which showed that these two fragments have a similar tendency for certain noncovalent interactions. The binding sites of (a) eurothiocin A and (b) Se-eurothiocin A on -glucosidase (PDB code 3aj7; Yamamoto et al., 2010), and the distributions of amino acid residues for the most stable orientations.

Figure 13
The binding sites and binding energies of (a) captopril and (b) Secaptopril on the active site of angiotensin I-converting enzyme (ACE) (PDB code 2x8z; Bhuyan & Mugesh, 2011).

Conclusions
Based on statistical data analysis of a large number of crystal structures, sulfur and selenium display similar tendencies towards specified interactions. Our crystallographic analysis reveals that the most numerous are structures with C-HÁ Á ÁSe and C-HÁ Á ÁS interactions ($80%). Notably fewer analysed structures exhibit SeÁ Á ÁSe and SÁ Á ÁS interactions ($5%), while SeÁ Á Á and SÁ Á Á interactions are the least numerous.
These results also indicate that both C-HÁ Á ÁSe and C-HÁ Á ÁS interactions are weaker than parallel, SeÁ Á Á (Senćanski et al., 2017) and electrostatic --type interactions. Although C-HÁ Á ÁSe and C-HÁ Á ÁS interactions are weaker than the rest, their significant presence can be explained because they accommodate a very large number of CH groups compared with the numbers of Se and S atoms. In addition, all substituents bonded with Se and S atoms influence the crystal packing of these molecules and have a tendency to reduce possibilities for SeÁ Á ÁSe and SÁ Á ÁS contacts. This can also explain why O-HÁ Á ÁSe ($ À4.4 kcal mol À1 ) and N-HÁ Á ÁSe interactions ($ À2.2 kcal mol À1 ) are less present than C-HÁ Á ÁSe interactions ($ À0.8 kcal mol À1 ).
The docking results revealed that both S and Se atoms rarely participate in interactions with amino acid residues, mostly because those residues preferentially interact with groups that are bonded to Se or S atoms. Also, in docking, all ligands are bonded on the surface and they are not completely surrounded by amino acids. That is another reason why they do not participate in the mentioned interactions. However, crystal packing dictates demands the densest packing of all species in a crystal. This is the main explanation of why interactions with sulfur and selenium are so numerous in crystal structures. Our statistical results reveal that, in the group of 769 structures that contain Se in our CSD search fragment (Fig. 1), interactions are present in 552 structures. A similar tendency is present taking an S atom into account: in the group of 19 355 structures with this search fragment in the CSD, interactions are present in 4090.
Analysing all groups bonded to an S or Se atom in the ligands and the docking results, there is no meaningful difference in the bonding of their derivatives if S or Se atoms are substituted by non-polar residues. This is the main reason why the two ligands eurothiocin A and Se-eurothiocin A bind to the active site in -glucosidase with similar binding energies (À7.3 and À7.4 kcal mol À1 ). All collected results were connected with data obtained from the CSD. There is no difference in trends for the sulfur and selenium compounds. In the CSD search fragments, sulfur or selenium are bonded to CH 2 and R groups. In the R position only Csp 3 and sp 2 atoms are mainly found (non-polar groups). It is evident that the search results only offer a better insight into compounds that bind only non-polar residues. Sulfur and selenium derivatives of N,N-diethylcarbamylcholine have shown similar and only marginal inhibitory activity towards electric eel acetylcholinesterase (0.2-0.3 mM) (Lindgren et al., 1985). Residues directly bonded to an S or Se atom in this case (alkyl and carbonyl group) do not have the ability to make classical hydrogen bonds as an H-atom donor (Fig. S18). The nature of directly bonded residues at carbamyl compounds (Se and S derivatives of 1-naphthyl-N-methyl carbamates, Fig. S18) is similar to previous carbamyl derivatives, which is probably the reason why these two derivatives have similar inhibitory activity towards electric eel acetylcholinesterase (0.2-0.3 mM) (Lindgren et al., 1985).
It is important to point out that the top docking position is conditioned by several different criteria, among them the lowest binding energy and the number of ligand interactions with other sites of the enzyme. If polar groups, suitable for interaction with other sites of the enzyme (possible donors of a hydrogen bond), are bonded to an S or Se atom in the R position, the number and positions of possible binding sites and their binding energies could vary. This difference is more pronounced if there is more than one S or Se atom in the ligand. Both PEIT and PEISe have only one S and one Se atom, respectively, substituted with one non-polar and one polar (amidine) group (Fig. 10). This amidine group has the possibility of being a donor for hydrogen bonding. PEIT has four binding sites for iNOS, while its selenium derivative PEISe has only three, but the latter's binding energy to the active site is slightly higher compared with PEIT (À8.5 kcal mol À1 for PEISe and À8.1 kcal mol À1 for PEIT). As a ligand with two Se atoms, PBISe has only one binding site, while its sulfur derivative PBIT has three. This is the major reason why PBISe is the best inhibitor for the iNOS enzyme (only one binding site and a notable binding energy compared with all the others).
The theoretical conclusions from our docking study on the iNOS enzyme have shown that inhibitory activity rises with increasing number of S atoms, and that the replacement of S with Se also leads to an increase in inhibitory activity. Although the mechanism of inhibition of iNOS is based on non-covalent interactions, this effect is observed for inhibition in which the mechanism is based on the reaction of corresponding S and Se inhibitors. In particular, one in vitro study considered the capacity of S-and Se-modified phenols from hydroxytyrosol to inhibit lipid peroxidation (Rodríguez-Gutié rrez et al., 2019). The sulfur compound with the strongest ability to inhibit lipid peroxidation (Fig. S19, compound 2) at the lowest concentration is the disulfide hydroxytyrosol derivative (percentage inhibition 50.78%). This derivative is more than twice as good an inhibitor as a similar compound with only one S atom (Fig. S19, compound 1, percentage inhibition 21.15%). Even better results were obtained in hydroxytyrosol, where two S atoms are replaced with Se atoms (Fig. S19, compound 3, percentage inhibition 80.95%). The study of the antioxidant effects of sulfur and selenium derivatives against disease is focused on several mechanisms, including radical scavenging, enzymes and metal-binding antioxidant mechanisms (Battin & Brumaghim, 2009). Therefore, the activity of these compounds will largely depend on their molecular structure, because that determines which of the abovementioned mechanisms will have the greatest influence on their activity. In the case of lipid peroxidation, the inhibition by S and Se compounds is most likely attributed to their non-research papers enzymatic antioxidant capacity. Given how molecular recognition and supramolecular structure have an important role in this, it is clear why conclusions about the activities of compounds, obtained from non-covalent models, can be applied to the understanding of reaction mechanisms.
There is one group of derivatives (monosubstituted derivatives) that express little possibility of interaction between S or Se atoms and amino acid residues. A typical example of this group is the binding of captopril and Se-captopril to the active site of angiotensin I-converting enzyme (ACE). These ligands have the same binding site and similar binding energies (À5.8 kcal mol À1 for captopril and À5.7 kcal mol À1 for Secaptopril) as they both possess a non-polar fragment directly bonded to an S or Se atom. However, the activity of these compounds demands the existence of Zn-S and Zn-Se coordination bonds and that leads to a higher chemical affinity for sulfur. This explains why captopril is a better inhibitor than Se-captopril.
To conclude, the replacement of an S with an Se atom does not provide significant changes in molecular structure (defined only by the types of interaction). On the other hand, small changes in the strength of those interactions cannot be ignored. The nature of the group directly bonded to these two atoms can provoke differences in binding to an enzyme or biosystems and leads directly to their different activities.