Will AI Chatbots Boost Efforts to Make Scientific Articles Free?

When it comes to accessing the latest scholarly articles, there is a severe digital divide. Students and faculty affiliated with most colleges have unlimited access to large collections of scholarship, such as JSTOR and HeinOnline, because their institutions subscribe to site licenses. For everyone else, however, those and many other scholarly publications sit behind paywalls or can be read only by paying steep fees for each article.

UCLA history professor Peter Baldwin calls it a “grotesque disparity” that many professors don’t even notice. After all, they are spoiled by easy access to scholarship, and they forget that as soon as their students graduate and leave campus, “you’re kind of banished from the digital paradise of the university world to that dark, inaccessible world.”

There is a long-standing call to make scholarship free for all, known as the open access movement. Baldwin argues that this moment, as AI tools such as ChatGPT reshape how information is shared, could be a tipping point that accelerates the move to open scholarship.

Baldwin’s latest book, “Athena Unbound: Why and How Scholarly Knowledge Should Be Free for All,” addresses the history and future of the open access movement. And fittingly, his publisher made a version of the book available for free online.

The professor is not arguing that all information should be free. He is focused on freeing up scholarship produced by those who have full-time jobs at colleges and who thus do not depend on payment for their writing to make a living. In fact, he argues, the whole idea of academic research depends on the free dissemination of work, so that other scholars can build on someone else’s idea or see from another scholar’s work that they may be headed for a dead end.

A typical open access model makes scholarly articles freely available to the public by charging authors a processing fee to publish their work in a journal. In some cases, that has created a new kind of challenge, because those fees are often paid by college libraries, and not all researchers have equal access to such support.

The number of open access journals has grown over the years. But most scientific journals still follow the traditional subscription model, according to recent estimates.

EdSurge recently caught up with Baldwin to talk about where he sees the movement headed.

Listen to the episode on Apple Podcasts, Overcast, Spotify, Stitcher, or wherever you get your podcasts, or use the player on this page. Or read the partial transcript below, lightly edited for clarity.

EdSurge: How would you describe the state of the open access publishing movement?

Peter Baldwin: It’s clear that we’re going in the right direction, but we’re also going there at very different speeds depending on what kind of content we’re talking about. So sciences like physics, math and computer science are mostly online. They mostly [post and comment on free pre-prints]. They have effectively solved the problem for themselves. That’s not to say journals don’t still exist. Take mathematics journals, for example. A prominent mathematician told me the other day that, of course, no one reads the journals, but they are still there.

They’re there because they’re basically used to validate hiring decisions. You know, a math career is made by getting your paper into the most prestigious math journals, and that validates your application on the job market, but no one actually reads the print version [because they saw the pre-print].

If universities simply decoupled their promotion, tenure, and hiring decisions from the hierarchies of journal authority, they could put journals entirely out of business insofar as they signal authority.

So this happens in some subjects but not in others. What would need to change for even the humanities to have more open access?

One big thing that would move us in this direction would be copyright law reform. I don’t think that will happen anytime soon, because the interests are so confused, mixed and conflicting that it will be nearly impossible to form any sort of coalition in favor of major copyright reform. But it would be necessary to shorten the term [that a work is covered by copyright], at least for scholarly research and its results.

Right now, the copyright term has been extended very far. In the beginning, in the late 18th century and early 19th century, when copyright laws were first written, the term was 14 years, and then sometimes you could renew it. So 14 years later, bang, the work entered the public domain. Now it is the life of the author plus 70 years. So easily over a century. And that’s what makes it a fight. And that’s why publishers won’t give it up, because they have this kind of trick that allows them to have ownership rights to intellectual property that are far stronger than the ownership rights we have in our homes or anything else that we own. They are practically perpetual property rights.

The reality, of course, is that the vast majority of all books are completely worthless commercially six months after publication, and yet they remain locked up under copyright for a century. It just doesn’t make sense. It would be much better to assume two to three years of commercial value. After two or three years, most books are no longer selling. And the few that are still being bought should, of course, remain in copyright and let publishers and authors make money from them. That is fine. But much of it simply no longer has any commercial value. And it should be made free. There’s really no reason not to make it free and let people read it without an account.

How would we do that? Would you have a system where if a book doesn’t make X amount of money after two years, it goes into the public domain?

Something like that. Then let’s say it suddenly started being downloaded like crazy, it went viral. Then it should be the right of the publisher and author to pull it out of the public domain and put out a new edition or something. I mean, I’m all for letting people who have something commercially valuable make money off of it. I just think that things that are sitting there, locked up and unusable, should be released because it’s good for them to be released. And there is no downside to this, because nobody loses anything. No one loses readership or revenue or royalties or anything like that.

There is a lot of talk about ChatGPT and other AI systems at the moment. How do you see this moment affecting open access scholarship?

I have two points I want to make about ChatGPT. The first is that American copyright law does not appear to allow copyright protection for anything that is not written by a human being. If that’s true, and it means that nothing ChatGPT produces is actually copyrighted, then it might just upend the copyright system. Because if 80 percent of our content isn’t copyrighted anymore, what’s the point of copyright? Then the little bits that are copyrighted, people will just ignore, because ChatGPT can do a better job anyway, or certainly an equally good job, of getting around the copyright issue. So it could be that it completely shakes up the entire copyright system.

The second point is that ChatGPT, as I understand it at the moment, scrapes and feeds from the seedy end of the web. That is what it can penetrate. It doesn’t feed off the good stuff on the web. As far as I know, it isn’t capable of getting past paywalls into scholarly databases and journals. To the extent that this is true, all we get from ChatGPT is garbage, and insofar as we want ChatGPT to actually be useful to us and help us, we desperately need it to be allowed access to [scholarship].

So in a sense, open access is the key to making ChatGPT work. Because a good ChatGPT should be based on the stuff that paywalls are keeping us away from right now.

What’s the point of having an incredibly powerful tool that only feeds on garbage when you can have an incredibly powerful tool that actually knows the information that’s out there? Presumably, anyone interested in ChatGPT would also be an open access advocate, as they would want ChatGPT to draw on the good parts of the web as well.

It sounds like people will want to build custom products that feed AI tools like ChatGPT, so maybe each subject has its own research chatbot or something.

Yes, Wikipedia, for example, is toying with the idea of creating a chat wiki that is fed mostly or only from Wikipedia, where at least the information has gone through a vetting process and isn’t just blobs.

I have to ask about piracy, because there are still large collections that offer free versions of scholarly articles in violation of copyright. How does this affect legal open access efforts?

Pirates are the open access movement’s best friend, but of course we can’t say that in polite company. We have to distance ourselves from them even as we acknowledge that they certainly hold publishers’ feet to the fire.

You can look back 20 years to the cowboy days of the Internet. Back then we had sites like Megaupload and Pirate Bay and places that took commercial content, mostly pop music and popular movies [and offered illegal copies for download]. All of that has been clamped down on through international regulation and cooperation among countries. Those sites were mostly shut down, but what do we have now? We have Spotify and Apple Music and Netflix. It’s obviously not open access, but it’s a reasonable form of access at a reasonable price. For $13 a month for Amazon Prime, you get, I think, like 15,000 movies and TV shows. You know, as a lending library, that’s not a bad model. And it’s clear that many members of the public have decided they’re willing to pay a reasonable price for reasonable access to a ton of good stuff.

So in the academic world, for scholarly knowledge, there are these sites that people go to. In some cases, they’re there because the Russians are funding them to thumb their noses at the Western publishing industry, just to be kind of annoying. In other cases, they are funded by contributions and voluntary donations and the like. They’re there because the publishing industry just hasn’t been able to get its act together and deliver content at a reasonable price.
