this post was submitted on 21 Oct 2023
379 points (97.7% liked)

Technology

[–] ElectroNeutrino@lemmy.world 168 points 11 months ago (4 children)

How about just not auto-convert everything and keep the integrity of the data unless specifically asked to? Is that so hard?

[–] Chais@sh.itjust.works 109 points 11 months ago* (last edited 11 months ago) (1 children)

Microsoft assumes their users are complete idiots, even when they (the users) are actively trying to convince them (Microsoft) otherwise. No matter how advanced the feature may be, they'll assume you found instructions somewhere to do something entirely unrelated and they constantly have to save you from yourself. As a result you constantly have to fight the OS for access and control to get it to do what you want.
If you're even a bit of a power user that is, of course.

But more often than not Microsoft's assumption is probably spot on.

[–] WhatAmLemmy@lemmy.world 8 points 11 months ago (1 children)

That assumption is perfectly good for a default. Not a mandatory feature that power users have to live with.

[–] Chais@sh.itjust.works 3 points 11 months ago* (last edited 11 months ago)

As a default, sure. Should be one that's easily changed, though. Repeatedly fighting the machine that's supposed to do your bidding and make your life easier gets old rather quickly. A machine you own and administrate, let's not forget that.

[–] Black616Angel@feddit.de 24 points 11 months ago (1 children)

Excel is inherently flawed in its design.

The thing is that Excel already has half of what it would need to really fix this bug: a per-cell field where the original text can stay.

An Excel sheet is just a bunch of XML files zipped in a specific structure. You can unpack a file and look for yourself.
Each worksheet is its own file, and each cell is subdivided into the value and the formula that generated this value (or nothing, if there is no formula).
Excel could easily fix this issue by adding another possible cell attribute like "original" or "plain" that, when set, allows you to roll back any conversion.

But no, they go a half assed way as always and screw up even more.
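
For anyone who wants to try the "unpack it and look" step, here is a minimal Python sketch. It does not open a real workbook: it builds a tiny, hypothetical archive in memory containing a single stripped-down worksheet part, then reads back each cell's formula (`<f>`) and stored value (`<v>`). The sample XML and the helper names (`build_demo_archive`, `read_cells`) are assumptions made for illustration; real files contain many more parts and attributes.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# XML namespace used by SpreadsheetML worksheet parts.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}

# Hypothetical, stripped-down worksheet: A1 holds a formula plus its cached
# value, B1 holds only a number (Excel stores dates as serial numbers like this).
SHEET_XML = """<?xml version="1.0" encoding="UTF-8"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
  <sheetData>
    <row r="1">
      <c r="A1"><f>1+2</f><v>3</v></c>
      <c r="B1"><v>45171</v></c>
    </row>
  </sheetData>
</worksheet>"""


def build_demo_archive() -> bytes:
    """Zip the demo XML under the usual path of a workbook's first worksheet."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("xl/worksheets/sheet1.xml", SHEET_XML)
    return buf.getvalue()


def read_cells(archive: bytes, part: str = "xl/worksheets/sheet1.xml") -> dict:
    """Return {cell reference: (formula or None, stored value or None)}."""
    with zipfile.ZipFile(io.BytesIO(archive)) as zf:
        root = ET.fromstring(zf.read(part))
    cells = {}
    for c in root.iterfind(".//m:c", NS):
        formula = c.findtext("m:f", default=None, namespaces=NS)
        value = c.findtext("m:v", default=None, namespaces=NS)
        cells[c.get("r")] = (formula, value)
    return cells


cells = read_cells(build_demo_archive())
print(cells)
```

Note that B1 only carries the number. Nothing in the cell records what was originally typed, which is the gap the proposed "original" attribute would fill.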

[–] RunningInRVA@lemmy.world 14 points 11 months ago (1 children)

In order to do that I think they would first have to ratify a standards change to the Excel format, which is open.

[–] MelodiousFunk@kbin.social 108 points 11 months ago (2 children)

Me before reading the article: It's got to be dates. Excel thinks everything is a date.

Me after reading the article: Even the workaround is halfhearted. Jeebus.

[–] TwinHaelix@reddthat.com 12 points 11 months ago

Microsoft’s blog adds caveats, such as that Excel avoids the conversion by saving the data as text, which means the data may not work for calculations later. There’s also a known issue where you can’t disable the conversions when running macros.

[–] Redacted@lemm.ee 5 points 11 months ago

Apart from actual dates.

[–] Artyom@lemm.ee 66 points 11 months ago (13 children)

The idea that any scientist is doing data analysis in Excel is honestly terrifying on every level.

[–] kootepe@sopuli.xyz 25 points 11 months ago

You don't want to know...

[–] griffinsklow@feddit.de 20 points 11 months ago

I remember when a biologist asked us for help: Excel crashed while processing his 700 MB tables. It took some time and ChatGPT to convince him to do the analysis in R. It worked out in the end and he is now recommending this solution to his colleagues, which is nice.

And is so bad at it that they can't work around this issue.

[–] Blackmist@feddit.uk 5 points 11 months ago

Flashback to the time the UK government lost 16,000 positive COVID patients because Excel has a 1 million row limit.

If only there were better ways of storing large amounts of records with a fixed structure. Maybe the future will provide such technology...
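
To make the sarcasm concrete: a rough sketch, using Python's built-in `sqlite3`, of storing more fixed-structure records than a single .xlsx worksheet can hold (1,048,576 rows). The table name, columns, and generated data are all hypothetical and have nothing to do with the actual system involved.

```python
import sqlite3

XLSX_MAX_ROWS = 1_048_576  # per-worksheet row limit of the .xlsx format


def store_results(rows):
    """Store (patient_id, result) pairs in an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_results (patient_id TEXT NOT NULL, result TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO test_results VALUES (?, ?)", rows)
    conn.commit()
    return conn


# Hypothetical data: 16,000 more rows than one worksheet could take.
n = XLSX_MAX_ROWS + 16_000
conn = store_results(
    (f"{i:08d}", "positive" if i % 50 == 0 else "negative") for i in range(n)
)

stored = conn.execute("SELECT COUNT(*) FROM test_results").fetchone()[0]
first_id = conn.execute(
    "SELECT patient_id FROM test_results ORDER BY rowid LIMIT 1"
).fetchone()[0]
print(stored, first_id)
```

Nothing is silently dropped past the worksheet limit, and because `patient_id` is declared as TEXT, IDs keep their leading zeros.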

[–] Evotech@lemmy.world 4 points 11 months ago (3 children)

Excel is excellent at data analysis... Python integrations and everything

[–] Artyom@lemm.ee 9 points 11 months ago (1 children)

As an alternative, maybe just Python?

[–] filcuk@lemmy.zip 11 points 11 months ago* (last edited 11 months ago) (1 children)

Because every scientist is also a programmer?
Especially if they struggle to use Excel properly, no chance.

[–] neuropean@kbin.social 48 points 11 months ago

Thank god! You have no idea how awful this is for scientists. Need to paste some gene names down? Better hope it’s not MARCHF8 or in the Septin gene family, otherwise you have to convert columns to text then import the data. Seems like a simple fix, but many wet lab biologists are technologically challenged.
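
Outside Excel, the "treat the column as text" step can be the default rather than a workaround. A small sketch with Python's standard `csv` module, which does no type inference at all, so every field comes back as the string that was in the file. The gene names are ones mentioned in this thread; the counts are made up for illustration.

```python
import csv
import io

# Hypothetical gene list; the counts are invented for this example.
RAW = "gene,count\nMARCHF8,12\nSEPT2,7\nMARCH1,3\n"


def load_as_text(source):
    """Parse CSV rows without type inference: every field stays a str."""
    return list(csv.DictReader(source))


rows = load_as_text(io.StringIO(RAW))
genes = [row["gene"] for row in rows]
print(genes)
```

The trade-off is the one the article's caveat describes: numeric columns also arrive as text, so anything you want to calculate with has to be converted explicitly (for example `int(row["count"])`).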

[–] JoBo@feddit.uk 39 points 11 months ago (1 children)

It's no good having this as part of the user options. It should be a sheet characteristic and the default should be "keep cells exactly as entered regardless of data type".

[–] kalleboo@lemmy.world 7 points 11 months ago* (last edited 11 months ago) (3 children)

Changing the default will break the workflows of tens of thousands of people in the business world.

Scientists should be using something like MATLAB, not Excel.

[–] emergencyfood@sh.itjust.works 6 points 11 months ago

Matlab is used, if at all, by physicists.

We're talking about molecular biologists.

[–] JoBo@feddit.uk 3 points 11 months ago

They're not doing their analysis in Excel. MATLAB solves no problems here?

[–] RheingoldRiver@kbin.social 3 points 11 months ago

You could make a new filetype, default new versions to it, and not break compatibility. It wouldn't do anything for existing workbooks, and you'd keep xlsx as an option, but "it would break compatibility" is not a be-all and end-all argument against this.

[–] chepox@sopuli.xyz 38 points 11 months ago

"Microsoft’s blog adds caveats, such as that Excel avoids the conversion by saving the data as text, which means the data may not work for calculations later. There’s also a known issue where you can’t disable the conversions when running macros."

This sounds very half assed...

[–] macrocephalic@lemmy.world 36 points 11 months ago (5 children)

Now if only it would stop dropping leading zeros unless you ask it, and we got rid of the MM/DD/yyyy date format entirely.
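
Both complaints are easy to demonstrate in a few lines of Python. This is only an illustration of the general problem, not of Excel's internals; the identifier and the date are hypothetical.

```python
from datetime import date


def roundtrip_numeric(raw: str) -> str:
    """Convert to a number and back, as an eager numeric conversion would."""
    return str(int(raw))


zip_code = "02134"                       # hypothetical ID with a leading zero
converted = roundtrip_numeric(zip_code)  # the zero is lost for good
kept = zip_code                          # stored as text, nothing is lost

# The same day written two ways. 03/04/2023 could be read as 3 April or
# 4 March depending on locale; the ISO 8601 form has only one reading.
d = date(2023, 3, 4)
us_style = d.strftime("%m/%d/%Y")
iso_style = d.isoformat()
print(converted, kept, us_style, iso_style)
```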

[–] theparadox@lemmy.world 8 points 11 months ago (2 children)

Now if only it would stop dropping leading zeros unless you ask it

That appears to actually be a feature.

[–] autotldr@lemmings.world 31 points 11 months ago (13 children)

This is the best summary I could come up with:


In 2020, scientists decided just to rework the alphanumeric symbols they used to represent genes rather than try to deal with an Excel feature that was interpreting their names as dates and (un)helpfully reformatting them automatically.

Yesterday, a member of the Excel team posted that the company is rolling out an update on Windows and macOS to fix that.

Excel’s automatic conversions are intended to make it easier and faster to input certain types of commonly entered data — numbers and dates, for instance.

But for scientists using quick shorthand to make things legible, it could ruin published, peer-reviewed data, as a 2016 study found.

Microsoft detailed the update in a blog post this week, adding a checkbox labeled “Convert continuous letters and numbers to a date.” You can probably guess what that toggles.

The update builds on the Automatic Data Conversions settings the company added last year, which included the option for Excel to warn you when it’s about to get extra helpful and let you load your file without automatic conversion so you can ensure nothing will be screwed up by it.


The original article contains 225 words, the summary contains 184 words. Saved 18%. I'm a bot and I'm open source!

[–] Kethal@lemmy.world 26 points 11 months ago

Microsoft fixes one of the Excel features that wreck scientific data.

[–] MonkderZweite@feddit.ch 26 points 11 months ago

20 years after the problem was first reported.

Meaning there's still hope for XDG support in Firefox?

[–] CatLikeLemming@lemmy.blahaj.zone 19 points 11 months ago (4 children)

This isn't a fix. Excel wasn't meant for this. While I do understand it's convenient as a database, unless you're doing something unimportant and small you really should use something proper. And even now that this "problem" is gone, I am certain there are still more things that cause trouble. You cannot satisfy everyone and Excel was just... not made for gene info storage.

Even if you don't want to use stuff that isn't Microsoft Office, that comes with Microsoft Access, which is a proper database management system. It's literally in the same software package, so why do people refuse to use it?

[–] zalgotext@sh.itjust.works 37 points 11 months ago (3 children)

Why would you need a full blown (shitty) relational database management system to store gene info? Excel should be just fine for storing data in arbitrary tables. It shouldn't make assumptions about your data by default, and changing values that look like they're in a specific format should be opt-in, not default behavior.

[–] magikmw@lemm.ee 5 points 11 months ago

Convenient arbitrary table software goes brrrr.

[–] mcc@sh.itjust.works 5 points 11 months ago (2 children)

Do you really not know why, or are you being sarcastic?

[–] echodot@feddit.uk 3 points 11 months ago

I'm so sick of people using Excel for things it's not supposed to be used for.

As a general rule if you're not actually making use of the formula tool, you probably don't need to be using Excel.

[–] detalferous@lemm.ee 14 points 11 months ago

From the article:

The problem of Excel software (Microsoft Corp., Redmond, WA, USA) inadvertently converting gene symbols to dates and floating-point numbers was originally described in 2004 [1]. For example, gene symbols such as SEPT2 (Septin 2) and MARCH1 [Membrane-Associated Ring Finger (C3HC4) 1, E3 Ubiquitin Protein Ligase] are converted by default to ‘2-Sep’ and ‘1-Mar’, respectively. Furthermore, RIKEN identifiers were described to be automatically converted to floating point numbers (i.e. from accession ‘2310009E13’ to ‘2.31E+13’). Since that report, we have uncovered further instances where gene symbols were converted to dates in supplementary data of recently published papers (e.g. ‘SEPT2’ converted to ‘2006/09/02’). This suggests that gene name errors continue to be a problem in supplementary files accompanying articles.
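
As a rough way to screen a gene list for this before it goes anywhere near a spreadsheet, here is a small Python heuristic. The patterns are my own approximation of "looks like a month plus a day" and "looks like scientific notation"; they are not Excel's actual conversion rules and will miss cases. `TP53` is included only as a control that should not be flagged.

```python
import re

# Month names and abbreviations that can prefix a date-like symbol.
MONTHS = "JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC"

# e.g. SEPT2, MARCH1, DEC-1: a month followed by a one- or two-digit number.
DATE_LIKE = re.compile(rf"^(?:{MONTHS})-?(\d{{1,2}})$", re.IGNORECASE)

# e.g. 2310009E13: digits, an E, more digits.
SCI_LIKE = re.compile(r"^\d+E\d+$", re.IGNORECASE)


def conversion_risk(symbol: str):
    """Return 'date', 'float', or None for a single identifier."""
    s = symbol.strip()
    if DATE_LIKE.match(s):
        return "date"
    if SCI_LIKE.match(s):
        return "float"
    return None


for name in ["SEPT2", "MARCH1", "2310009E13", "TP53"]:
    print(name, conversion_risk(name))
```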

[–] Deebster@programming.dev 14 points 11 months ago* (last edited 11 months ago)

It's too late though, scientists already had to rename the genes. Although of course there are other things that can trigger it, not just in science.

[–] maniel@lemmy.ml 7 points 11 months ago

What about text selection knowing better what I want to select?

[–] Etterra@lemmy.world 3 points 11 months ago (1 children)

LibreOffice is free, and modern MS Office UIs look like dog dookie. LibreOffice can also save in Excel format if you want.

Hey look at that, I found a solution that doesn't require them to change their entire process or wait for Microsloughed to get their act together.
