Why do I copy such large amounts of source material verbatim from sources?

From WhyNotWiki

Is it copyright infringement?

I don't know; I'm still trying to figure out what fair use is. If it is (that is, if you feel that I have infringed on copyright for your work), then please let me know and I will promptly remove it...

But I think I do know what motivates me to do it:

Why?

  • ★★★ Ephemerality of web sites: "Here today, gone tomorrow." Source web pages can go away without warning (I distrust the permanency of my sources). If it was publicly available yesterday, that's no guarantee that it will still be publicly accessible (or even "privately accessible") tomorrow. I want permanence.
  • ★★★ Organizational integration: I want text/ideas from other sources (besides me) to be integrated into my organizational system/structure (categories, tags, etc.)
  • ★★★ Annotation: I want to be able to underline the best sections, insert in-line comments, etc.
    • (Reviews and criticism of literary works have always been a classic example of fair use, and you can't really point to "that one sentence right over there" without actually repeating the sentence for the readers of your review/critique to see...)
  • ★★★ Limited time; much to do: I'm a researcher. Collecting good sources is an integral part of research. There's a lot of information to assimilate and it is usually much faster to quote than to paraphrase. So in the interests of time (and because the original wording often cannot be adequately summarized or improved on), I tend to quote more than paraphrase.
  • ★★☆ Enables more relevant searching: I've already gone through the work of selectively deciding which sites/pages have interesting/worthwhile content. Now if I want to search through that nice, known-good subset (say, to find again what I found before), how would I do that? If I had only links to the pages rather than cached copies, I couldn't really do a full-text search of that subset. Everything needs to be in one database.
  • ★★☆ Easier to share: For example, to tell someone, "Here's all the (good) material I've come across on [that topic]." Also, it lets me share my marked up / condensed version (should be turn-off-able, though)... "Here it is with my favorite parts highlighted, if you don't have time to read the whole thing through..."
    • Also, see the previous point about "Enables more relevant searching": It's one thing for me to want to search within content I've already seen and visited, but the ability to search within an already-filtered/selective subset of the Web would probably be even more useful to those who hadn't seen most of the material before (and for some reason trusted me enough to do their filtering for them as they search for good information about these particular topics...).
  • ★☆☆ Can make improvements: I can fix typos and other mistakes this way.
  • Visual consistency: I want text to be rendered consistently (without extraneous colors, advertisements, etc.), regardless of its source.
  • Enables more creative and flexible searching: For example, if I wanted to find out something unusual like how many authors I've cited prefer __ over __ [sorry, I need to think of an example], then having everything stored in a single database would be tremendously helpful towards enabling that kind of introspective querying.
  • A personal library: Sometimes when I read a really good book (etc.) or watch a really good movie, it sort of makes me want to have my own copy, to add it to my own personal collection. Perhaps I never will look at it again, but I still like the feeling that I can pull that book off my shelf at any time and experience it again if I ever want to. The same collector's compulsion applies to digital sources: if it's really good, then it'd be great to have a copy of it for my collection.
  • A reflection of self: By having the sum total of the knowledge that I care about all together in one place, it lets people (including myself, notably) see how widely read I am (so I can show off? perhaps a little bit...), see what kind of material I like, agree with, etc. (lets people get to know me), and (as mentioned above) search within and share (very useful).
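
The "everything needs to be in one database" point from the searching items above can be sketched roughly as follows. This is only a minimal illustration in Python, assuming the cached copies are available as plain text keyed by page title; the names build_index, search, and cached_pages are hypothetical and not part of any existing tool.

```python
import re
from collections import defaultdict

WORD = re.compile(r"[a-z0-9']+")

def build_index(pages):
    """Map each lowercase word to the set of page titles containing it."""
    index = defaultdict(set)
    for title, text in pages.items():
        for word in WORD.findall(text.lower()):
            index[word].add(title)
    return index

def search(index, query):
    """Return the sorted titles of pages that contain every word in the query."""
    words = WORD.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)

# Hypothetical cached copies (assumed sample data, not real sources).
cached_pages = {
    "Link rot": "Web pages disappear without warning.",
    "Fair use": "Quoting for criticism may be fair use.",
}
index = build_index(cached_pages)
```

With only links instead of cached text there would be nothing to feed into build_index, which is the whole point: the search can only cover what has actually been copied into the collection.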

I certainly do everything I can to give proper attribution to sources from which I quote so that I'm not guilty of plagiarism, but I'm not sure whether or not my usage of these sources goes beyond what is allowed by "fair use".

Problem: Source web pages can go away without warning

Wikipedia has tried to work around this problem by finding where there's another copy of the page or where the page moved to, with the help of the Internet Archive, etc. [1]

They may provide an "archive URL" (if they can find one) in the citation that lets one still access it, but it looks like the original URL is still maintained (?).

w:Template:cite web

"Using archiveurl and archivedate to refer to items that went away but are available from an archive site"

More about this problem: Link rot and Web site archiving
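
The "keep the original URL, but also record an archive URL" approach described above could be modeled roughly like this. It is a sketch under my own assumptions, not how Wikipedia implements its template: the Citation class, its field names, and best_link are hypothetical, and whether the original page is still live is simply passed in rather than checked over the network.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Citation:
    """A cited source that keeps its original URL alongside an optional archived copy."""
    title: str
    url: str
    archive_url: Optional[str] = None
    archive_date: Optional[str] = None

def best_link(citation: Citation, original_is_live: bool) -> str:
    """Prefer the original URL; fall back to the archive only if the original is gone."""
    if original_is_live or citation.archive_url is None:
        return citation.url
    return citation.archive_url

# Hypothetical example data (placeholder URLs, not real sources).
example = Citation(
    title="Some article",
    url="http://example.com/article",
    archive_url="http://archive.example.org/2007/article",
    archive_date="2007-01-01",
)
unarchived = Citation(title="Other article", url="http://example.com/other")
```

Keeping both fields means the citation still records where the material originally lived, even when readers have to be sent to the archived copy.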

Problem: I want to be able to underline the best sections, and insert in-line comments (annotation)

I'm trying to solve this problem here: Category:Document annotation

Why don't I ask permission?

You might think that I ought to just ask permission, if I truly believe (as I do) that the copyright holders won't (or shouldn't) have a problem with my use of their material...

That's a legitimate assertion... But the reason I don't typically ask permission is really, really simple...

Because it would take an inordinate amount of time to ask permission from every author whose works I cite. And I don't want to waste my time.

I believe (at least in this case) that it is better to ask forgiveness than to beg permission. It's certainly much easier.

My philosophy/defense

If the resource really is freely accessible on the web (that is, I can freely link to it), then it doesn't seem like such a departure to me, to go from linking to a page to simply including the contents inline.

Since the page is freely accessible, I could have also just printed the page off and marked it up (written notes on it, underlined it with a pen), but that's very low-tech and has many disadvantages when compared with using a digital copy. Why is there so much concern about making digital copies but nobody cares about people making hard copies?

Also, since the page is freely accessible, I could just download a copy of the page to my hard drive for my personal reference. That's okay, right? So the main difference between that and putting it in, say, a wiki is that other people can view it too. If that's a problem, then I should invest in a system that prevents anyone other than myself from viewing large verbatim quotes from sources.

Whoever published the work on the Web to begin with must have realized that this allows anyone in the world to access the document. They must not have minded that their work would be freely available on the Web (otherwise why did they put it on the Web?). So since it's freely available and accessible on the original site (and I always link back to the source in case anyone wants to see it in the original context), I don't see why the copyright holder would have a problem with it.

(Non-freely-available content, such as that on for-pay sites like Safari Books, which restrict access and require me to log in to see the content, is of course a different story... I wouldn't make public what isn't already public. But when it's already out there on the public World Wide Web...?)

In other words... How is it any different from linking to it (or subscribing to your RSS feed and having the content delivered (copied) into my news reader), or viewing it on your site?

The only difference is that it's framed in a slightly different webpage template/header/footer / context. But the actual content is the same (with the exception of typos, of course, because what literate person can resist the temptation to fix typos??)...

Hopefully they/you will be flattered that I chose to copy their content. They should be especially appreciative if their goal is to get the widest possible audience for their ideas, because I would be providing free marketing/publicity/exposure...

Most web sites nowadays (blogs, at least) have an RSS feed to encourage syndication. I'm simply syndicating the content as well (?), just in a different way than the usual news-reader + RSS feed...

How is pulling material from a web site (or other source) and adding a copy of it to my personal digital collection any different than obtaining a copy of a physical book and putting it on my personal bookshelf? If it's a matter of my not having paid for it like I did the book, then may I remind you that it was posted on the Internet (presumably by the author) and anyone can already look at it for free (at no cost) on the Web, whether or not it's also posted on my site. If it's a matter of price, then just name your microprice to obtain a copy of your material and I'll be happy to pay it. (I don't mind paying micropayments, but did you know that in general, with the practically 0 cost of reproducing content, the whole idea of expecting to sell content via micropayments is pretty much dead?) Or is the difference that bothers you that books on my physical bookshelf can only be read by me or the few people who happen to visit my physical abode, whereas sources in my digital online library can be read by anyone anytime anywhere in the world? So it's a matter of me freely sharing what's already freely available?

A cache

I consider it a cache, no more evil than Google's page cache (or "the Internet Archive"), which I have certainly found useful from time to time (and you probably have too) for accessing pages that are temporarily or permanently unavailable. If Google can do it, why can't I?

And there are many other caches in use around the world. They speed up page load times for those lucky enough to use them. [source needed]
