Wikipedia:WikiProject Chemistry/IRC discussions/22 Jan 2008

--- Log opened Tue Jan 22 11:06:15 EST 2008

11:06 <walkerma> Hi, sorry I'm a couple of minutes late! Have you been discussing anything?

11:06 -!- ChemSpiderMan [[email protected]] has joined #wikichem

11:06 <walkerma> Hi Antony

11:07 <+Rifleman_82> hi antony

11:07 <ChemSpiderMan> Hi all

11:07 <+Rifleman_82> just started i guess

11:07 <+Beetstra> Hi guys

11:07 -!- mode/#wikichem [+o Beetstra] by ChanServ

11:07 -!- mode/#wikichem [+v ChemSpiderMan] by Beetstra

11:07 -!- mode/#wikichem [+v walkerma] by Beetstra

11:08 <+walkerma> Beetstra: What does that do?

11:08 <+Rifleman_82> +v gives you a + beside your name

11:09 <+Rifleman_82> when the channel is moderated, only those with @ (ops) and + (voice) can talk

11:09 <+Rifleman_82> the rest can listen but not talk

11:09 <@Beetstra> Nothing special here .. but if I have to moderate the channel because of trolling, then people with 'voice' can still speak, the others that don't have voice can't say anything

11:09 <+dmacks> Wanna kick the bot?

11:09 -!- Netsplit niven.freenode.net <-> irc.freenode.net quits: +Physchim62

11:09 <+Rifleman_82> yes please

11:09 <+Rifleman_82> pc not staying?

11:09 <+walkerma> Thanks! Do you want to moderate this meeting, Beetstra?

11:09 <@Beetstra> CheMoBot quit

11:09 -!- CheMoBot [[email protected]] has quit ["Mayday! Mayday! .. going down!"]

11:10 <+dmacks> netsplit...woooo:(

11:10 <@Beetstra> No, I leave that to you ..

11:10 <+Rifleman_82> i'm quite tired, so i'll prolly stay til max 1 am my time

11:10 <+dmacks> I'm logging

11:10 <+Rifleman_82> hey what happened to the last log?

11:11 <+Rifleman_82> i thought we were going to put it up somewhere?

11:11 <+dmacks> I may have a copy, can't remember who was actually planning to do it.

11:11 <+Rifleman_82> oh

11:11 <+walkerma> I have a log, but I wasn't sure how to distribute it - then the semester started...

11:11 <+Rifleman_82> i've got a copy, which i sent to pc

11:11 <+Rifleman_82> gimme a moment

11:11 <+Rifleman_82> i'll upload it

11:11 <+walkerma> Thanks

11:12 <+Rifleman_82> ed not joining us?

11:12 -!- Netsplit over, joins: +Physchim62

11:13 <+walkerma> OK, ChemSpiderMan, could you update us on the database? What still needs to be done?

11:13 <+ChemSpiderMan> I need to finish from P to W

11:14 -!- Physchim62 [n=Physchim@unaffiliated/physchim62] has quit [Read error: 110 (Connection timed out)]

11:14 <+walkerma> Will you be doing that some time next month? Is that the plan?

11:14 <+ChemSpiderMan> Then I need to go through one more time...faster second time...

11:14 <+Rifleman_82> sheesh

11:14 <+ChemSpiderMan> hopefully first week of Feb

11:15 <+walkerma> Great! I just looked at my Sandbox, and it looks like things are progressing there - many of the errors have been fixed.

11:15 <+walkerma> There are a couple of general problems we should probably agree on:

11:16 <+ChemSpiderMan> Second time through checking for some complex natural products

11:16 <+ChemSpiderMan> maito-toxin is a bear

11:17 <+walkerma> http://en.wikipedia.org/wiki/Image:Maitotoxin.png

11:17 <+walkerma> Looks more like a snake than a bear to me...

11:17 <+dmacks> ha!

11:18 * dmacks uses that as a teaching example of how "not-simple" an ether can be.

11:18 <+walkerma> Good idea, dmacks! Is it worth treating these "bears" as a separate list? That need more than one person to check them?

11:18 <+ChemSpiderMan> sorry..phone right now

11:18 <+Rifleman_82> okay logs at http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_meeting_1

11:19 <+Rifleman_82> it's a mess but i think it's readable

11:19 <+walkerma> Thanks, R82!

11:19 <+Rifleman_82> i'll figure a way to do it nicely and clean it up

11:19 <+Rifleman_82> but i think it'll do for the moment

11:19 <+Rifleman_82> np martin

11:19 <+dmacks> Yeah, if there are some specific monsters that you want to set aside somewhere, /me can look as time permits.

11:19 <+Rifleman_82> oh...

11:19 <+Rifleman_82> well instaview and wiki doesn't give the same effect

11:19 <+Rifleman_82> i'll try it out while we discuss

11:20 <+walkerma> "Monsters" sounds like a good name for the page! Then we can check these carefully against the primary literature

11:20 <+walkerma> There are some other issues that are general:

11:21 <+walkerma> 1. How do we represent salts? We need a clear policy

11:21 <+walkerma> 2. How do we represent sugars - ring or open chain?

11:21 <+walkerma> 3. How do we address tautomers where both forms are stable

11:22 <+walkerma> Should we discuss these here?

11:22 <+Rifleman_82> what's the problem with salts?

11:22 <+walkerma> Often a structure will not have the counterion, but the CAS no does.

11:23 <+walkerma> Or perhaps a drug will be drawn in a neutral form, but the drug is a succinate salt or something

11:23 <+Rifleman_82> icic

11:24 <+walkerma> Perhaps we should say in our new MOS that we require salts to show their counterions - no quat ammoniums without the Cl- or whatever

11:24 <+walkerma> Does this sound reasonable?

11:24 <+Rifleman_82> agree

11:24 <+dmacks> concur

11:24 <+walkerma> http://en.wikipedia.org/wiki/Nile_blue

11:25 <@Beetstra> Hmm .. that gives the problem that you can't discuss the ammonium ion .. or you have to discuss it on every page (chloride, bromide, acetate, nitrate)

11:25 <@Beetstra> I would say .. a compound gets a chembox .. so ammonium chloride

11:26 <@Beetstra> But ammonium ion gets another box .. ionbox e.g.

11:26 <+dmacks> Is "nile blue" really the salt, or is it the imine, which is available as many HX salts?

11:26 <+Rifleman_82> chloride not seen

11:26 <@Beetstra> As I mentioned for functional groups

11:27 <+dmacks> Rifleman_82: Wikipedia:WikiProject Chemistry/IRC meeting 1a ?

11:27 <+walkerma> I'm guessing that it is generally used as the chloride, because that is what the CAS and formula give

11:28 <+ChemSpiderMan> Sorry..I'm back...I think the compound shown needs to be connected to the article name

11:28 <+Rifleman_82> dmacks: ?

11:28 <+dmacks> Okay, so that seems like a simple structure-drawing error.

11:28 <+ChemSpiderMan> The primary key of the article is the compound name..not the structure

11:28 <+dmacks> Rifleman_82: cleaner upload of the log

11:28 <+dmacks> Right, so again are there many possible "nile blue" with different counteranions, or is it specifically Cl- ?

11:29 <+ChemSpiderMan> So, is Nile Blue a chloride salt or not?

11:29 <+ChemSpiderMan> yes..exactly

11:29 <+ChemSpiderMan> Also, INTERNAL consistency between structure, SMILES and CAS

11:29 <+Rifleman_82> dmacks:looks very nice indeed. i'll move and delete mine

11:29 <+Rifleman_82> you're the official secretary henceforth!

11:29 <+ChemSpiderMan> Nile Blue...the structure has no Chloro...the SMILES does.

11:29 <+dmacks> I think we got not-very-far with this discussion last time, what happens when "the name" (wiki page title) does not map to a single compound.

11:30 <+dmacks> Rifleman_82: ok

11:30 <+ChemSpiderMan> Don't know what the CAS is

11:30 <+ChemSpiderMan> what's an example?

11:30 <+ChemSpiderMan> That will help me think about it...

11:30 <+Rifleman_82> betamethasone?

11:30 <+dmacks> Tartaric acid.

11:30 <+Rifleman_82> you have the valerate, and various other esters

11:31 <+Rifleman_82> betamethasone could use a copyedit since we're on it

11:32 <+ChemSpiderMan> betamethasone...is it a trade name for a material or the name of the steroid itself as drawn?

11:33 <+ChemSpiderMan> The way it is shown is that betamethasone is the structure drawn in the box...

11:33 <+Rifleman_82> free acid?

11:33 <+ChemSpiderMan> It says "It is available as a number of esters: Dipropionate (branded as Diprosone, Diprolene and others), Sodium Phosphate and Valerate (branded as Betnovate, Celestone and others)." and I think that covers the rest

11:33 <+Rifleman_82> maybe we stick with dmacks' simpler example

11:33 <+Rifleman_82> for the moment

11:34 <+ChemSpiderMan> tartaric acid...looking

11:35 <+ChemSpiderMan> This looks okay...

11:35 <+ChemSpiderMan> is there an issue I am missing?

11:35 <+walkerma> Look at the table at the bottom.

11:35 <+walkerma> The natural name for an article of this sort is "tartaric acid"

11:36 <+ChemSpiderMan> Yes...the name is fine

11:36 <+Rifleman_82> you have d and l and meso

11:36 <+walkerma> But there are several stereoisomers, and mixtures

11:36 <+dmacks> One "name" is three compounds, plus there's prolly also a CAS for the racemate.

11:36 <+ChemSpiderMan> The structure is fine...since the structure is NOT stereospecific

11:36 <+walkerma> And there's a CAS for "unspecified" as well, almost certainly

11:36 <+dmacks> (yup)

11:36 <+walkerma> So we can't use CAS as primary key

11:37 <+ChemSpiderMan> The way to specify for each of D/L/meso is to have separate articles

11:37 <+walkerma> I don't think we want that.

11:37 <+ChemSpiderMan> I agree

11:37 <+dmacks> concur strongly.

11:37 <+ChemSpiderMan> so this is fine as is I think

11:38 <+walkerma> (Walkerma considers how many isomers there are for maitotoxin)

11:38 <+dmacks> CAS in infobox is (according to the table) the generic for this name.

11:38 <+dmacks> Would we want at least separate data for each compound?

11:38 <+ChemSpiderMan> Going back to what I sense as the issue is the structure drawn should coincide with the article title and all derivatives (SMILES, etc) should be for that

11:39 <+ChemSpiderMan> So, if the article says chloride...show the chloride

11:39 <+ChemSpiderMan> have Chloride in the SMILES

11:39 <+ChemSpiderMan> have CAS for the chloride...not the neutral

11:39 <+ChemSpiderMan> have name for the chloride

11:39 <+walkerma> Concur. We should make this VERY clear in the MOS
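
Note: the rule agreed above (if the article title names a counterion, the structure and SMILES should carry it too) could be checked mechanically. The following is only a minimal Python sketch under stated assumptions, not anything the participants wrote. It assumes the counterion is written as its own dot-separated fragment in the SMILES; the function names and the mapping from title words to SMILES fragments are hypothetical.

```python
from typing import Dict, List


def smiles_fragments(smiles: str) -> List[str]:
    """Split a SMILES string into its disconnected components ('.' separates them)."""
    return [frag for frag in smiles.strip().split(".") if frag]


def has_counterion(smiles: str, counterion: str) -> bool:
    """Return True if the counterion appears as its own fragment in the SMILES."""
    return counterion in smiles_fragments(smiles)


def check_record(title: str, smiles: str, expected: Dict[str, str]) -> List[str]:
    """Report counterions implied by the article title but missing from the SMILES.

    `expected` is a hypothetical mapping from a word in the title
    (e.g. "chloride") to the SMILES fragment it should produce (e.g. "[Cl-]").
    """
    problems = []
    for word, fragment in expected.items():
        if word in title.lower() and not has_counterion(smiles, fragment):
            problems.append(f"{title}: title says '{word}' but SMILES lacks {fragment}")
    return problems


# Assumed mapping for illustration only.
COUNTERIONS = {"chloride": "[Cl-]"}

print(check_record("Ammonium chloride", "[NH4+].[Cl-]", COUNTERIONS))  # consistent
print(check_record("Ammonium chloride", "[NH4+]", COUNTERIONS))        # counterion missing
```

This compares fragments as plain strings, so a counterion written in a different but equivalent way would be reported as missing. A real check would probably compare parsed structures with a cheminformatics toolkit.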

11:39 <+ChemSpiderMan> there are many examples where this doesn't happen

11:41 <+walkerma> Hopefully after this sweep there won't be many, and if people are more aware of it this problem won't happen so much in the future?

11:41 <+ChemSpiderMan> I think you are right.

11:41 <+ChemSpiderMan> It's very common with dyes to see no counterion

11:42 <+ChemSpiderMan> http://en.wikipedia.org/wiki/Azorubine has been cleaned up now..

11:42 <+walkerma> That's partly the history - my old UK boss worked in dyes - often they didn't even know the structure of what they made

11:43 <+ChemSpiderMan> no sodium ions before. The name was disodium and the CAS number was 2Na+

11:43 <+ChemSpiderMan> -related so should have been there

11:45 <+walkerma> So shall we agree that 1. We are consistent about counterions between structure, SMILES, CAS etc?

11:45 <+ChemSpiderMan> Agreement from me of course :-)

11:45 <+dmacks> yup

11:45 <+walkerma> 2. We use the Wikipedia article name as the "primary key" (at least for now) for the database, not the CAS?

11:45 <+ChemSpiderMan> for sure!

11:46 <+dmacks> Sounds good.

11:46 <+ChemSpiderMan> The article name "belongs" to wikipedia...CAS numbers don't

11:46 <+ChemSpiderMan> By that I mean that the article exists under the name

11:46 <+ChemSpiderMan> The CAS number is an association at best

11:46 <+Rifleman_82> hmm

11:46 <+Rifleman_82> the wp article name is not particularly static

11:46 <+Rifleman_82> or rather, there are many which need to be rationalized

11:46 <+walkerma> It's the way we organize things here, so it's the natural way - unless someone can come up with something better

11:47 <+Rifleman_82> like chloroplatinic acid

11:47 <+ChemSpiderMan> But there is no way to validate CAS or investigate via CAS...CAS is "behind closed doors"

11:47 <+Rifleman_82> c.f. dihydrogen hexachloroplatinate(2-)

11:47 <+Rifleman_82> i'm not rooting for cas

11:47 <+Rifleman_82> i'm saying there may be certain issues

11:47 <+walkerma> Yes, but if you look at the 6000 organics in ChemSpiderMan's collection, there must only be a handful changing their names each month, if that

11:47 <+Rifleman_82> unless we can validate all the names first?

11:47 <+ChemSpiderMan> You can search on CAS numbers but turn up a lot of poor associations

11:47 <+Rifleman_82> all 3000+ of them

11:48 <+ChemSpiderMan> Validate the names? Many of the names are NOT systematic...

11:48 <+ChemSpiderMan> sildenafil.

11:48 <+dmacks> ICANN is "a system" :)

11:48 <+ChemSpiderMan> heme c

11:48 <+Rifleman_82> http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_discussions

11:48 <+ChemSpiderMan> many,many,many

11:49 <+Rifleman_82> and the ethanoic/acetic acid discussion...

11:49 <+ChemSpiderMan> chloroplatinic acid is a "common name"

11:49 <+ChemSpiderMan> you are now systematizing.

11:49 <+Rifleman_82> also placed a link to http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry

11:49 <+dmacks> thx

11:49 <+Rifleman_82> check out the big box at the top of the main page

11:51 <+ChemSpiderMan> the question is "what will people search on"...and I think chloroplatinic acid is more likely to be searched

11:51 <+ChemSpiderMan> but isn't necessarily the "correct name"

11:51 <+Rifleman_82> so long as there are redirects you can call it anything you like

11:51 <+Rifleman_82> but i guess we need to have certain... arbitrary but consistent standards for naming

11:51 <+ChemSpiderMan> Look at Walkerma and my discussion on DMF...

11:51 <+ChemSpiderMan> Yes, within the ChemBox for sure.

11:52 <+walkerma> I think we'll have to ponder the issue of validating the names. That could be a whole hour of IRC alone.

11:52 <+ChemSpiderMan> The best names possible- can convert back to the right structure for example

11:52 <+Rifleman_82> but chembox should pull the name from the article top

11:52 <+Rifleman_82> from the article name

11:52 <+ChemSpiderMan> Also...IUPAC vs CAS vs Beilstein?

11:52 <+Rifleman_82> the name = param should be used sparingly! if you use it it implies a lack of consistency, and it breaks when the page is moved

11:53 <+ChemSpiderMan> separate discussion...I say Systematic names in the ChemBox at all times (if possible)

11:53 <+ChemSpiderMan> but the article name CAN be systematic but shouldn't have to be...

11:53 <+ChemSpiderMan> otherwise you will be renaming HUNDREDS

11:54 <+ChemSpiderMan> Look on http://en.wikipedia.org/wiki/User:Walkerma/Sandbox

11:54 <+ChemSpiderMan> how many are systematic article TITLES?

11:54 <+ChemSpiderMan> 5%?

11:55 <+walkerma> Actually for inorganics, it's over 50%

11:55 <+walkerma> I'd guess

11:55 <+dmacks> Does an infobox map to the page (i.e., "generic tartaric acid") or to a particular compound (separate infobox for each isomer)?

11:55 <+ChemSpiderMan> sorry...I think you are right for inorganics...

11:56 <+ChemSpiderMan> they are also "easier" in many ways...

11:56 <+walkerma> dmacks raises an important issue

11:56 <+ChemSpiderMan> oxides, sulfides, sulphates etc...but organometallics and organics are not this way

11:57 <+walkerma> generic Tartaric acid does not have a specific MP, solubility, etc

11:57 <+Rifleman_82> generic = rac? or undefined?

11:57 <+dmacks> Rifleman_82: There appears to be a CAS for undefined.

11:57 <+walkerma> generic = undefined

11:57 <+walkerma> Because if it means rac, then we need a separate article on the meso

11:58 <+walkerma> Which we don

11:58 <+walkerma> 't want

11:58 <@Beetstra> For some, like tartaric acid, there could be a page for each .. but most are a problem

11:58 <+ChemSpiderMan> Undefined...

11:58 <+dmacks> If we had separate data table for each *compound* (in whatever sense makes it unique), would be easier to process it to/from databases. Then an article (which could have a title that is less specific than a single compound) could have data for each one.

11:59 <+dmacks> I disagree that "separate page for each compound" is a good solution, since they are often going to be copy'n'pastes of each other with [a]D sign-change.

11:59 <+Rifleman_82> m.p. may change too

12:00 <+Rifleman_82> and IR spectra , but we're not doing IR spectra

12:00 <+walkerma> We can't have separate pages for everything like that

12:00 <+walkerma> I think the tartaric acid article handles some of the data well - a nice table listing things like CAS

12:00 <@Beetstra> No, there are some exceptions .. most don't deserve both

12:00 <+walkerma> But we should really list MPs for each, solubilities, etc

12:00 <@Beetstra> 2/3 chemboxes on one page .. or a generic chembox on the page, and a /datapage?

12:01 <+walkerma> And the alpha-Ds for each of course!

12:01 <@Beetstra> And on the /datapage all chemboxes

12:01 <+Rifleman_82> or can we just stick to the undefined?

12:01 <+Rifleman_82> and prefer anhydrous over monohydrate

12:01 <@Beetstra> i mean, you can make that choice ..

12:01 <+Rifleman_82> prefer freebase over .HCl

12:02 <+Rifleman_82> WP:not CRC handbook?

12:02 <+walkerma> I think, Rifleman_82, WP is (for many people) now their CRC

12:02 <+walkerma> And their Merck index, their Aldrich catalogue

12:03 <+ChemSpiderMan> yes...it is becoming that way

12:03 <+walkerma> That's why people like Antony and Peter are interested in it

12:03 <+ChemSpiderMan> actually it's NOT the info in the ChemBox that's of interest to me at all

12:03 <+ChemSpiderMan> It's the text...

12:04 <+ChemSpiderMan> the descriptions, the history etc

12:04 <+dmacks> We (wikipedia) don't have to be comprehensive, but do need to be specific, and if others want to be able to process it automatically, need *some* systematic format for it.

12:04 <+Rifleman_82> how about a quick round - how many of you guys trust the data in chemboxes?

12:04 <+ChemSpiderMan> The majority of the ChemBox is of little concern (sorry guys)..

12:04 <+ChemSpiderMan> But it DOES need to be right for those who need it.

12:04 <+Rifleman_82> i don't trust it. if it really matters, i'll check CRC

12:05 <+ChemSpiderMan> I don't trust it...

12:05 <+Rifleman_82> if it doesn't matter, if i just want a feel for the state of matter, i'll trust the chembox

12:05 <+walkerma> So you need the Chembox as a "door" to find the text, is that right?

12:05 <@Beetstra> Same for me, trust .. nah .. but generally use it .. if it matters, I check properly

12:05 <+Rifleman_82> having entered many a chembox, i think i know better than to trust it

12:05 <+dmacks> Don't trust, but do fix when I find *blatant* errors (which isn't that often)

12:05 <+dmacks> I figure mp ballpark, etc.

12:05 <+Rifleman_82> most chemboxes are entered from MSDS, which are actually not authoritative

12:05 <+ChemSpiderMan> No...I need the article name to find the text...but the ChemBox is supposed to be correct which is why I want it fixed for you guys

12:06 <+Rifleman_82> and diff MSDS' conflict with each other esp bp mp and appearance

12:06 <+ChemSpiderMan> mp and bp are uncommon

12:06 <@Beetstra> One of the problems here is the free access .. anyone can put in anything .. all we can do is 'protect' it

12:06 <@Beetstra> (protect as in 'I don't trust your change, revert')

12:07 <+walkerma> That's the validation/flagged revisions issue, a whole other debate

12:07 <+dmacks> Last time I proposed putting the data on a separate page from the article, so that it would be easier to monitor changes to it.

12:07 <+ChemSpiderMan> The real ChemBox content of interest for most people I think is as follows: structure drawing, name, SMILES.

12:07 <+ChemSpiderMan> That's 95% of the value I think..

12:07 <+dmacks> Okay, we'll save that debate for later.

12:07 <+Rifleman_82> agree with ChemSpiderMan

12:07 <+walkerma> I'd like to bring this meeting to a close, if that's OK

12:08 <+ChemSpiderMan> ok

12:08 <+dmacks> Disagree...few care about SMILES, much of target audience cares about general physical properties and mw

12:08 <+dmacks> okay.

12:08 <+Rifleman_82> SMILES is useful

12:08 <+Rifleman_82> cut and paste smiles into chemsketch to generate structure

12:08 <+dmacks> Right, but not to most of who read wikipedia.

12:08 <+dmacks> *whom

12:08 <+ChemSpiderMan> people have no way to generate the structure

12:08 <+Rifleman_82> and for search?

12:09 <+ChemSpiderMan> text-based

12:09 <+ChemSpiderMan> no way to search Wikipedia by structure...SMILES is no good.

12:09 <+ChemSpiderMan> There are too many flavors...they have spaces on WP

12:09 <+ChemSpiderMan> too many issues

12:09 <+dmacks> yeah

12:09 <+ChemSpiderMan> It's the other reason I am doing the project with the SDF generation

12:10 <+ChemSpiderMan> Walkerma...how long left...I have a proposal

12:10 <+walkerma> Propose it, and I can always say, "Another day!"

12:11 <+ChemSpiderMan> When the SDF file is done I will supply the following:

12:11 <+ChemSpiderMan> Chemical Structures consistent with the title of the article (or my best suggestion)

12:11 <+ChemSpiderMan> SMILES strings for those structures

12:11 <+ChemSpiderMan> InChI Strings for those structures

12:11 <+ChemSpiderMan> InChIKeys for those structures

12:12 <+ChemSpiderMan> IUPAC names from software (no human bias) generated for those structures

12:12 <+ChemSpiderMan> Mw

12:12 <+ChemSpiderMan> Molecular Formulae

12:12 <+ChemSpiderMan> ALL generated by the same software package

12:12 <+ChemSpiderMan> Now...they will need publishing to WP

12:12 <+ChemSpiderMan> The challenge is as follows:

12:13 <+ChemSpiderMan> I want to have a second/third set of eyes to confirm that the structures are appropriate for the article

12:13 <+ChemSpiderMan> Or..you can trust me...

12:13 <+ChemSpiderMan> I would prefer you DON'T trust me

12:13 <+walkerma> We can trust a spider, right....?

12:13 <+ChemSpiderMan> It is an exhausting project and tired eyes...

12:13 <+Rifleman_82> haha

12:13 <+dmacks> Very funny, Miss Muffet.

12:14 <+ChemSpiderMan> (they bite you on the bum in Australia)

12:14 <+walkerma> Seriously, I agree, we should double check

12:14 <+ChemSpiderMan> Phew...

12:14 <+dmacks> Yes. If structure is the primary key for "a compound", it needs multiple eyes.

12:14 <+Rifleman_82> SDF = ?

12:14 <+ChemSpiderMan> Concatenated molfile

12:15 <+Rifleman_82> oic

12:15 <+Rifleman_82> replacing pngs?

12:15 <+ChemSpiderMan> I can just send a PDF File associated with each "letter" for now...

12:15 <+ChemSpiderMan> PNGs is image format..not a connection table

12:15 <+Rifleman_82> ok

12:15 <+ChemSpiderMan> I say we "try" a dry run with the letter "A"

12:15 <+walkerma> Better split up A into 2 or 3 PDFs

12:16 <+ChemSpiderMan> hold on..will tell you how big

12:16 <+ChemSpiderMan> about 250 records...

12:17 <+ChemSpiderMan> can split as necessary...how about chunks of 50 records per...

12:17 <+walkerma> That's OK, my workbooks for students are over 200 pages of PDF!

12:17 <+ChemSpiderMan> I am only sometimes validating PubChem links...so I am not taking that one at present

12:18 <+ChemSpiderMan> And CAS...I don't have Scifinder so can NEVER validate...just look for consistency

12:18 <+ChemSpiderMan> with other DBs...

12:18 <+ChemSpiderMan> someone else will need to check CSA

12:18 <+ChemSpiderMan> CAS

12:18 <+walkerma> Anyone here with Scifinder? I have to drive to another college for that

12:19 <+ChemSpiderMan> If we are ready to do this then I can send out the first 50 tonight/tomorrow

12:19 <+Rifleman_82> i have

12:19 <+Rifleman_82> i'll be quite free after next wed

12:19 <@Beetstra> I have .. but for 4000 compounds ..

12:19 <+Rifleman_82> but yeah, for 4k compounds?

12:19 <+ChemSpiderMan> it's expensive...

12:19 <+walkerma> We need to split the work up over several people, and probably several months too

12:19 <+Rifleman_82> can we filter out those easily verified ones?

12:20 <+Rifleman_82> use google to let webpages "vote"?

12:20 <@Beetstra> I have uni-access .. but only limited number of accounts .. guess they will be angry if they see me do this

12:20 <+Rifleman_82> if a lot of relatively reliable web sources agree on the cas, then we let it go?

12:20 <+ChemSpiderMan> this is the problem...that's my approach at present...

12:20 <+Rifleman_82> same thing, we have a limited number of accounts so i can probably only check for half an hour a day

12:20 <+Rifleman_82> it's only the exotic which really need more attention

12:20 <+ChemSpiderMan> CAS might take a while...so be it

12:21 <+Rifleman_82> the cas numbers of ethanol are probably verifiable by google
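
Note: two cheap filters follow from the discussion above, sketched here in Python under stated assumptions. The first uses the check digit that every CAS Registry Number carries as its last digit, which catches typing errors but says nothing about whether the number belongs to the compound. The second is a simple majority vote over values reported by several sources, along the lines Rifleman_82 suggests. The function names and the `min_votes` threshold are hypothetical.

```python
import re
from collections import Counter
from typing import List, Optional

# CAS Registry Number layout: 2-7 digits, 2 digits, 1 check digit.
CAS_PATTERN = re.compile(r"^(\d{2,7})-(\d{2})-(\d)$")


def cas_checksum_ok(cas: str) -> bool:
    """Verify the CAS check digit.

    Digits before the check digit are weighted 1, 2, 3, ... from right
    to left; the weighted sum modulo 10 must equal the check digit.
    """
    match = CAS_PATTERN.match(cas.strip())
    if not match:
        return False
    digits = match.group(1) + match.group(2)
    total = sum(weight * int(d) for weight, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


def majority_cas(reported: List[str], min_votes: int = 2) -> Optional[str]:
    """Return the most frequently reported well-formed CAS number,
    or None if no value reaches `min_votes` (an assumed threshold)."""
    valid = [c.strip() for c in reported if cas_checksum_ok(c)]
    if not valid:
        return None
    cas, votes = Counter(valid).most_common(1)[0]
    return cas if votes >= min_votes else None


print(cas_checksum_ok("64-17-5"))  # ethanol: 7*1 + 1*2 + 4*3 + 6*4 = 45, 45 % 10 == 5
print(majority_cas(["64-17-5", "64-17-5", "64-17-4"]))
```

Entries that fail either filter, or where sources disagree, would be the ones worth spending limited SciFinder time on.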

12:21 <+ChemSpiderMan> I have done the biggest part I "think"

12:21 <+ChemSpiderMan> So you are "checking"

12:21 <+ChemSpiderMan> should be much faster

12:21 <+walkerma> Yes, thank you for all your work, CSM

12:21 <+ChemSpiderMan> It might be good for a rotation

12:21 <+ChemSpiderMan> One person take the first 50

12:21 <+ChemSpiderMan> Next person the next 50

12:21 <+walkerma> As a student project?

12:21 <+ChemSpiderMan> Your call gents...

12:21 <+ChemSpiderMan> as long as the person "cares"

12:21 <+Rifleman_82> whoever can harness their students, please do so!

12:22 <+Rifleman_82> then, we can divvy up the remaining load

12:22 <+ChemSpiderMan> there needs to be process so that 10 students all aren't reviewing the same stuff.

12:22 <+ChemSpiderMan> Maybe a set of letters to each group?

12:22 <+ChemSpiderMan> or records 1-500, 501-1000, etc

12:22 <+Rifleman_82> or we can all "ask" for a quantity

12:23 <+ChemSpiderMan> yup

12:23 <+Rifleman_82> let's say i want 600 entries

12:23 <+Rifleman_82> so you allocate 600 for me

12:23 <+Rifleman_82> and give it to noone else

12:23 <+ChemSpiderMan> and I can manage the distribution...

12:23 <+Rifleman_82> yeah that's simple
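
Note: the allocation scheme described above (each reviewer asks for a quantity and gets a block of records that nobody else receives) can be sketched as follows. This is a minimal Python illustration, assuming records are numbered from 1 and each reviewer makes one request; the function name and reviewer names are hypothetical.

```python
from typing import Dict, List, Tuple


def allocate(total_records: int, requests: List[Tuple[str, int]]) -> Dict[str, range]:
    """Hand out consecutive, non-overlapping record ranges in request order.

    Each request is (reviewer, quantity). A reviewer receives fewer records
    than requested if the pool runs out, and nothing once it is empty.
    """
    allocation = {}
    next_record = 1
    for reviewer, quantity in requests:
        if next_record > total_records:
            break  # nothing left to hand out
        end = min(next_record + quantity - 1, total_records)
        allocation[reviewer] = range(next_record, end + 1)
        next_record = end + 1
    return allocation


blocks = allocate(1000, [("reviewer_a", 600), ("reviewer_b", 600)])
for reviewer, records in blocks.items():
    print(reviewer, records.start, "to", records.stop - 1)
```

Because every block starts where the previous one ended, no two reviewers can end up checking the same record.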

12:23 <+walkerma> Rifleman, could you set up a page on wiki for this?

12:23 <+ChemSpiderMan> agreed...

12:23 <+Rifleman_82> don't think we need to worry about this too much

12:23 <+Rifleman_82> ok

12:23 <+walkerma> OK, should we meet again next week at the same time?

12:23 <+Rifleman_82> sure

12:23 <+Rifleman_82> martin

12:24 <+Rifleman_82> perhaps you or dmacks can briefly summarize the discussion of this and the last meeting

12:24 <+ChemSpiderMan> I can't...sorry. Have a meeting next week about adding 250,000 Open access chemistry articles to ChemSpider

12:24 <+Rifleman_82> good luck tony!

12:24 <+ChemSpiderMan> :-)

12:24 <+walkerma> Maybe (if we can get PC to stay) we could discuss InChIs and InChIKeys

12:24 <+Rifleman_82> and post it at http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/IRC_discussions

12:24 <+ChemSpiderMan> we are indexing International Union of Crystallography back to 1948...fun

12:25 <+Rifleman_82> cifs?

12:25 <+ChemSpiderMan> abstracts and chemical names...and try to convert to structures..

12:25 <+ChemSpiderMan> PC?

12:25 <+walkerma> Yes, and I'll post about next week on the projects as well.

12:25 <+walkerma> PC = Physchim62

12:25 * dmacks will post log ASAP when we're done here.

12:25 <+Rifleman_82> ok

12:25 <+walkerma> He wrote the InChI script

12:26 <+ChemSpiderMan> I am interested too...

12:26 <+walkerma> I think many of the issues are to do with how we handle these with the wiki markup and formatting

12:26 <+dmacks> yeah, long InChI keys, etc.

12:27 <+walkerma> So it's probably of less interest to you, CSM

12:27 <+ChemSpiderMan> okay...I'm not needed...one comment...do not BREAK the InChI...no spaces

12:27 <+walkerma> YES!

12:27 <+ChemSpiderMan> also, for InChIKeys...there is a powerful way to use them...let me get the link

12:27 <+ChemSpiderMan> http://www.chemspider.com/news/searching-inchikeys-by-connectivities-only-with-and-without-stereo.html

12:28 <+dmacks> WP really needs a way to allow long text strings to be line-breaked in the middle (i.e., not just whitespace)

12:28 <+ChemSpiderMan> search by connectivity and search with stereo
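
Note: the "search by connectivity" idea rests on the layout of the InChIKey, whose first hyphen-separated block is a hash of the connectivity, with stereochemistry and other layers hashed into the later part. The following Python sketch groups keys by that first block, so stereoisomers of one skeleton end up together. The keys below are made-up placeholders, not real InChIKeys, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def connectivity_block(inchikey: str) -> str:
    """Return the first hyphen-separated block of an InChIKey."""
    return inchikey.strip().split("-")[0]


def group_by_connectivity(keys: List[str]) -> Dict[str, List[str]]:
    """Group InChIKeys that share the same connectivity block."""
    groups = defaultdict(list)
    for key in keys:
        groups[connectivity_block(key)].append(key)
    return dict(groups)


# Placeholder keys for illustration only: two "stereoisomers" and one unrelated compound.
keys = [
    "AAAAAAAAAAAAAA-BBBBBBBBBB-N",
    "AAAAAAAAAAAAAA-CCCCCCCCCC-N",
    "ZZZZZZZZZZZZZZ-BBBBBBBBBB-N",
]
for block, members in group_by_connectivity(keys).items():
    print(block, len(members))
```

Matching on the full key corresponds to searching with stereo; matching on the first block alone corresponds to searching by connectivity only.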

12:28 <+ChemSpiderMan> You WILL need standards for the acceptance of InChIStrings...they should NOT be generated by the depositor in my opinion

12:28 <+walkerma> dmacks - that's what we need to resolve next time

12:28 <+dmacks> okay

12:28 <+ChemSpiderMan> If you give InChI generation choices you will be in trouble

12:29 <+ChemSpiderMan> Bye

12:29 <+walkerma> OK, I must get on as well

12:29 <+dmacks> Shall we close for today?

12:29 <+walkerma> See you next week?

12:29 <+dmacks> yup

12:29 -!- ChemSpiderMan [[email protected]] has quit []

12:30 <+Rifleman_82> ok

12:30 <+Rifleman_82> http://en.wikipedia.org/wiki/Wikipedia:WikiProject_Chemistry/CAS_validation

12:30 <+Rifleman_82> sorry if it isn't polished, it's late and i'm not fully functioning

12:30 <+walkerma> Thanks, we can polish it later

12:30 <+Rifleman_82> whoever wants to tweak it, please don't wait for me

12:30 <@Beetstra> I am afraid the only reasonable way at the moment is to use a 'InChI' (the correct one) and a DispInChI, the one that is on display, nicely broken where needed

12:30 -!- walkerma [[email protected]] has quit ["ChatZilla 0.9.80 [Firefox 2.0.0.11/2007112718]"]

12:31 <@Beetstra> In that way the right and correct InChI is in the box ..
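
Note: the two-field idea above (an unbroken InChI for machines, plus a display copy that may wrap) could look roughly like this. It is a minimal Python sketch, assuming the display copy is derived by adding a zero-width space after each "/" layer separator so a browser may break the line there; that choice of break character, and the function names, are assumptions rather than anything decided in the meeting.

```python
ZERO_WIDTH_SPACE = "\u200b"  # invisible character that permits a line break


def validate_inchi(inchi: str) -> str:
    """Reject a stored InChI that is broken by whitespace or lacks the prefix."""
    if any(ch.isspace() for ch in inchi):
        raise ValueError("stored InChI must not contain whitespace")
    if not inchi.startswith("InChI="):
        raise ValueError("stored InChI must start with 'InChI='")
    return inchi


def display_inchi(inchi: str) -> str:
    """Derive a display-only copy with a break opportunity after each '/'."""
    return validate_inchi(inchi).replace("/", "/" + ZERO_WIDTH_SPACE)


stored = "InChI=1S/C2H6O/c1-2-3/h3H,1-2H3"  # ethanol
shown = display_inchi(stored)

# The display copy is always regenerated from the stored value, never the reverse.
print(shown.replace(ZERO_WIDTH_SPACE, "") == stored)
```

Text copied from the display field would carry the invisible characters along, which is the reason for keeping the unbroken string in its own field.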

12:31 <+Rifleman_82> sounds fair enough

12:31 <+Rifleman_82> guys, nice talking

12:31 <+Rifleman_82> i gotta sleep

12:31 <+Rifleman_82> good night!

12:31 -!- Rifleman_82 [n=blahblah@wikipedia/Rifleman-82] has quit []

--- Log closed Tue Jan 22 12:35:03 EST 2008