Copyright and Machine Learning IP

syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

hgm wrote: Mon Aug 13, 2018 8:49 am
syzygy wrote: Mon Aug 13, 2018 1:17 am OK, so you agree that a random ordering is not protected.
Of course, that was never an issue. Also, the decimal representation of the value of pi would be unprotected (list of facts).
That functional features are not protected is also well established. See e.g. here:
4.2 (...) The standard thus applied by the court of appeal is correct. Elements of the work that serve solely a technical effect, or that are too much the result of a choice limited by technical constraints, are excluded from copyright protection (cf. HR 22 February 2013, ECLI:NL:HR:2013:BY1529, NJ 2013/501, para. 3.4 under (c), and the BSA judgment of the CJEU cited there and relied on by the ground of appeal).
IMO "to serve solely a technical effect" here would apply to the requirement that it plays legal Chess, not that it plays good chess. As Chess is yet unsolved, move selection takes place by heuristics, which require creativity to design.
Football is not yet solved, but 11 players running behind a ball according to some strategy designed by their coach are playing to win and not to create a work protected by copyright. I'm not saying there is no creativity in designing a chess engine, but it's another type of creativity than what is needed for copyright.
But even in the rare case where something creative in the end result can somewhat plausibly be seen, I suspect there is no room for doubt: "playing style" probably does not legally qualify as something that can be protected. But here I am speculating.
Playing style would merely be the output of the NN, though. I don't want to protect the style, but the NN that generated it.
But the NN itself is not perceivable by the human senses.

There can be copyright on an mp3 file because there is copyright on the music it encodes, but there is no separate copyright on the numbers.
So it is more comparable to the case where I write a pseudo-random number generator. We agree that the output of such a PRNG is not copyrightable. But would the code for the PRNG be copyrightable? It might be far better than any existing PRNG, in terms of the metrics used to judge PRNGs (e.g. periodicity, long-range correlations).
The algorithm for a PRNG is not copyrighted. The implementation in source code is copyrighted (because the EU Directive says so). Those aspects of the implementation that are determined by the algorithm will not be copyrighted.
It is not at all inconceivable that I could obtain code for a PRNG by randomly generating C programs from the C syntax, with some restriction (claimed by me to be creatively designed) on the syntax (e.g. so that the resulting function always terminates and returns an integer).
"So that the resulting function always terminates and returns an integer" is a functional restriction.
Would that disqualify me from getting copyrights on any PRNG code I carefully designed by hand, with the aid of deep mathematical theories?
So code yes, algorithm no.
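To make the "code yes, algorithm no" distinction concrete, here is a minimal sketch in Python. It is purely illustrative and is nobody's actual PRNG code: the names `LcgA` and `lcg_b` are hypothetical, and the algorithm (a linear congruential generator with the well-known "Numerical Recipes" constants) is picked only because it is short. The point is that two independently written implementations of the same algorithm produce identical output, so on the view above any copyright would have to sit in how each one is written, not in the recurrence or the numbers.

```python
from itertools import islice

# One algorithm: x_{n+1} = (A * x_n + C) mod M (a linear congruential generator).
# Constants are the "Numerical Recipes" LCG parameters, used here only as an example.
A, C, M = 1664525, 1013904223, 2**32


class LcgA:
    """Implementation 1 (hypothetical): an object with explicit state."""

    def __init__(self, seed):
        self.state = seed % M

    def next(self):
        self.state = (A * self.state + C) % M
        return self.state


def lcg_b(seed):
    """Implementation 2 (hypothetical): a generator using a bit mask.

    For M = 2**32 and non-negative values, `& (M - 1)` equals `% M`.
    """
    x = seed % M
    while True:
        x = (A * x + C) & (M - 1)
        yield x


gen_a = LcgA(42)
out_a = [gen_a.next() for _ in range(5)]
out_b = list(islice(lcg_b(42), 5))

# Different source text, same algorithm, same (uncopyrightable) output.
print(out_a == out_b)
```

The two versions differ only in choices the algorithm does not dictate (class versus generator, modulo versus mask), which is roughly where any "personal stamp" would have to be found.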
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

Fulvio wrote: Mon Aug 13, 2018 10:33 am
hgm wrote: Mon Aug 13, 2018 8:49 am So it is more comparable to the case where I write a pseudo-random number generator. We agree that the output of such a PRNG is not copyrightable. But would the code for the PRNG be copyrightable? It might be far better than any existing PRNG, in terms of the metrics used to judge PRNGs (e.g. periodicity, long-range correlations).
In my opinion both the code and the output of the PRNG will have copyrights.
For example it will be possible to sell a high quality list of pseudo-random numbers and customers will not have the rights to distribute that list to others.
The output is certainly free of copyright.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

chrisw wrote: Mon Aug 13, 2018 6:04 pm It's generally thought that a chess game has no copyright, but I think the argument for that view depends on the game being played by two opposing players.
That would at least be my argument, yes.
If two players collude to create a poem on a chess board, they share the copyright on the game.
It is not entirely clear that copyright in a self-play game produced by a machine, or in a collection of such games, could not be claimed by the machine's programmer, especially if he was also the operator. Not saying it is, but ...
For now my view is that the author (who must obviously be a human or a group of humans) must at some level of consciousness have conceived the elements that are in principle copyrightable. If you make an ingenious machine that autonomously develops into being a painter, I would tend to say you are not the author of the paintings (nor is anyone else). As long as the machine is just your instrument, you are the author. If the machine takes over, you stop being the author. But I may turn out to be wrong here.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

syzygy wrote: Mon Aug 13, 2018 9:25 pm Football is not yet solved, but 11 players running behind a ball according to some strategy designed by their coach are playing to win and not to create a work protected by copyright. I'm not saying there is no creativity in designing a chess engine, but it's another type of creativity than what is needed for copyright.
Again not sure what you want to demonstrate by this. Surely soccer matches are copyrighted; I cannot sit in the stadium filming them with my phone, and broadcasting them on TV or on a website. They ask huge fees for such things.

Are you now arguing that Chess engines cannot be copyrighted at all? I can just copy Houdini, and sell it at a lower price than Robert asks for it?
But the NN itself is not perceivable by the human senses.

There can be copyright on an mp3 file because there is copyright on the music it encodes, but there is no separate copyright on the numbers.
Let me try the same logic on a computer with a (currently) more conventional computer architecture: "i86 code cannot be perceived by the human senses, so there is no copyright on computer programs". Doesn't sound right to me.
The algorithm for a PRNG is not copyrighted. The implementation in source code is copyrighted (because the EU Directive says so). Those aspects of the implementation that are determined by the algorithm will not be copyrighted.
So we seem to agree that source code (and thus presumably also the object code) of a program that produces uncopyrightable output that is virtually indistinguishable from unimaginatively obtained other uncopyrightable output is indeed protected by copyright.
"So that the resulting function always terminates and returns an integer" is a functional restriction.
Yes, so? Virtually every program that is ever written was written to produce output that must satisfy some functional requirements. "Plays Chess at a level >2000 Elo" is a functional requirement. Is a program that does that (and nothing else) not protected by copyright?

Note that the halting problem for Turing machines is undecidable.
Would that disqualify me from getting copyrights on any PRNG code I carefully designed by hand, with the aid of deep mathematical theories?
So code yes, algorithm no.
Just to make sure we are still on the same page:
* If I write code that does virtually the same thing as other code that was generated by a smart automatic process, my code is still protected by copyrights.
* If I write code that does virtually the same thing as other code that was generated by a dumb automatic process, my code is still protected by copyrights.
('Smart' and 'dumb' refer here to the designer of that process having been creative or not.)
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

Fulvio wrote: Mon Aug 13, 2018 6:37 pm
hgm wrote: Mon Aug 13, 2018 2:20 pm The question is whether there do exist cases where it does matter what training examples are selected, and how these would be judged by existing law.
The training set is a database:
https://en.wikipedia.org/wiki/Database_Directive
and you have to acquire the rights for data mining:
https://en.wikipedia.org/wiki/Data_mini ... n_Europe_2

But after that using the data to train a NN is just an elaborate analysis and the copyright of the database will not apply to the NN.
Sui generis database rights in the EU are a bit more complicated than that.
Art. 1(2) Directive 96/9/EC wrote:For the purposes of this Directive, ‘database’ shall mean a collection of independent works, data or other materials arranged in a systematic or methodical way and individually accessible by electronic or other means.
Let's accept for a moment that the weights of a neural net qualify as "database" in the above sense.
Art. 7(1) Directive 96/9/EC wrote:Member States shall provide for a right for the maker of a database which shows that there has been qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents to prevent extraction and/or re-utilization of the whole or of a substantial part, evaluated qualitatively and/or quantitatively, of the contents of that database.
So the creator of the NN has to show that "there has been qualitatively and/or quantitatively a substantial investment in either the obtaining, verification or presentation of the contents".

According to C-203/02 (William Hill):
The expression ‘investment in … the obtaining … of the contents’ of a database in Article 7(1) of Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases must be understood to refer to the resources used to seek out existing independent materials and collect them in the database. It does not cover the resources used for the creation of materials which make up the contents of a database.

The expression ‘investment in … the … verification … of the contents’ of a database in Article 7(1) of Directive 96/9 must be understood to refer to the resources used, with a view to ensuring the reliability of the information contained in that database, to monitor the accuracy of the materials collected when the database was created and during its operation. The resources used for verification during the stage of creation of materials which are subsequently collected in a database do not fall within that definition.

The resources used to draw up a list of horses in a race and to carry out checks in that connection do not constitute investment in the obtaining and verification of the contents of the database in which that list appears.
An NN "database" is a collection of weights. The investment in obtaining those weights is the investment in the creation of those weights. According to C-203/02, that investment is not protected by a sui generis database right. Any investment in the seeking out of the games used to create the weights does not count, since those games do not form the contents of the NN "database".

So it seems relatively clear that an NN is not protected by a sui generis EU database right.
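A toy sketch of the distinction relied on above, between materials that are collected and weights that are created. This is an assumption-laden illustration in Python, not how any real chess NN is trained: the "network" is a single linear unit, the four training examples are made up, and `train` is a hypothetical helper. It only shows that the training examples are an input to the process, while the resulting artifact contains nothing but numbers computed during training.

```python
# Made-up training set: pairs (x, y) generated from y = 2x + 1.
# In the analogy above, this plays the role of the collected games.
training_set = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]


def train(examples, steps=5000, lr=0.05):
    """Fit y = w*x + b by plain gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(examples)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in examples) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in examples) / n
        w -= lr * grad_w
        b -= lr * grad_b
    # The returned "database" holds only the created weights,
    # not the examples that were used to create them.
    return {"w": w, "b": b}


weights = train(training_set)
print(weights)
```

Whether such created values count as "obtained" contents in the sense of Art. 7(1) is of course the legal question discussed above; the code only shows what ends up in the artifact.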
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Copyright and Machine Learning IP

Post by chrisw »

syzygy wrote: Mon Aug 13, 2018 10:15 pm
chrisw wrote: Mon Aug 13, 2018 6:04 pm It's generally thought that a chess game has no copyright, but I think the argument for that view depends on the game being played by two opposing players.
That would at least be my argument, yes.
If two players collude to create a poem on a chess board, they share the copyright on the game.
It is not entirely clear that copyright in a self-play game produced by a machine, or in a collection of such games, could not be claimed by the machine's programmer, especially if he was also the operator. Not saying it is, but ...
For now my view is that the author (who must obviously be a human or a group of humans) must at some level of consciousness have conceived the elements that are in principle copyrightable. If you make an ingenious machine that autonomously develops into being a painter, I would tend to say you are not the author of the paintings (nor is anyone else). As long as the machine is just your instrument, you are the author. If the machine takes over, you stop being the author. But I may turn out to be wrong here.
on authors ....
Authors are not necessarily copyright holders and vice versa. In the case of the ingenious machine, you'd have the case where the author would be de-referenced by the machine from the work. Ideal lawyer-land, contractually speaking.

There's a reference in the OP which suggests some jurisdictions already give copyright credit for works from ingenious machines ..

http://www.wipo.int/wipo_magazine/en/20 ... _0003.html

"In Europe the Court of Justice of the European Union (CJEU) has also declared on various occasions, particularly in its landmark Infopaq decision (C-5/08 Infopaq International A/S v Danske Dagblades Forening), that copyright only applies to original works, and that originality must reflect the “author’s own intellectual creation.” This is usually understood as meaning that an original work must reflect the author’s personality, which clearly means that a human author is necessary for a copyright work to exist.

The second option, that of giving authorship to the programmer, is evident in a few countries such as Hong Kong (SAR), India, Ireland, New Zealand and the UK. This approach is best encapsulated in UK copyright law, section 9(3) of the Copyright, Designs and Patents Act (CDPA), which states:

“In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.”

Furthermore, section 178 of the CDPA defines a computer-generated work as one that “is generated by computer in circumstances such that there is no human author of the work”. The idea behind such a provision is to create an exception to all human authorship requirements by recognizing the work that goes into creating a program capable of generating works, even if the creative spark is undertaken by the machine."


Whether neural net weights can form a "work" is another matter, of course.
hgm
Posts: 27788
Joined: Fri Mar 10, 2006 10:06 am
Location: Amsterdam
Full name: H G Muller

Re: Copyright and Machine Learning IP

Post by hgm »

Note that he said the training set was a database, not the NN.
chrisw
Posts: 4313
Joined: Tue Apr 03, 2012 4:28 pm

Re: Copyright and Machine Learning IP

Post by chrisw »

hgm wrote: Mon Aug 13, 2018 10:35 pm
syzygy wrote: Mon Aug 13, 2018 9:25 pm Football is not yet solved, but 11 players running behind a ball according to some strategy designed by their coach are playing to win and not to create a work protected by copyright. I'm not saying there is no creativity in designing a chess engine, but it's another type of creativity than what is needed for copyright.
Again not sure what you want to demonstrate by this. Surely soccer matches are copyrighted; I cannot sit in the stadium filming them with my phone, and broadcasting them on TV or on a website. They ask huge fees for such things.

Soccer matches are not copyrighted. You cannot sit in the stadium filming them with your phone and broadcasting them on TV or on a website, because the conditions on which you entered the grounds will either be printed on your ticket or else made plain in some other way, and you contractually agreed to them when you paid 50 euros for the ticket to get in. Filming not allowed. Breach of contract. Not copyright.

Entering the Louvre and photographing the Mona Lisa is also forbidden. Therefore photos of the Mona Lisa, unless authorised, would have been taken illegally, and so on. Likewise the interior of the Sistine Chapel. Photography not allowed. You want to sell pictures? Show the contract that overruled the no-photography rule. Likewise pop concerts, filming not allowed.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

hgm wrote: Mon Aug 13, 2018 10:35 pm
syzygy wrote: Mon Aug 13, 2018 9:25 pm Football is not yet solved, but 11 players running behind a ball according to some strategy designed by their coach are playing to win and not to create a work protected by copyright. I'm not saying there is no creativity in designing a chess engine, but it's another type of creativity than what is needed for copyright.
Again not sure what you want to demonstrate by this. Surely soccer matches are copyrighted; I cannot sit in the stadium filming them with my phone, and broadcasting them on TV or on a website. They ask huge fees for such things.
They can kick you out of the stadium if the small print on your ticket says you're not allowed to film, but you would not be infringing any copyrights. If you do make a recording and contribute a modicum of creativity to that recording, then you own the copyright on that recording.

According to the CJEU in C-403/08:
96 FAPL cannot claim copyright in the Premier League matches themselves, as they cannot be classified as works.

97 To be so classified, the subject-matter concerned would have to be original in the sense that it is its author’s own intellectual creation (see, to this effect, Case C‑5/08 Infopaq International [2009] ECR I‑6569, paragraph 37).

98 However, sporting events cannot be regarded as intellectual creations classifiable as works within the meaning of the Copyright Directive. That applies in particular to football matches, which are subject to rules of the game, leaving no room for creative freedom for the purposes of copyright.

99 Accordingly, those events cannot be protected under copyright. It is, moreover, undisputed that European Union law does not protect them on any other basis in the field of intellectual property.
(The CJEU then points out EU member states are allowed to provide protection on sports games in their national laws. I suppose France may have a law protecting the Tour de France organisers. Organisers of football matches can achieve what they need by controlling what people do in a stadium.)
Are you now arguing that Chess engines cannot be copyrighted at all? I can just copy Houdini, and sell it at a lower price than Robert asks for it?
The code can, the functionality cannot.
Let me try the same logic on a computer with a (currently) more conventional computer architecture: "i86 code cannot be perceived by the human senses, so there is no copyright on computer programs". Doesn't sound right to me.
You can disassemble it and then it makes sense. There might be relatively little copyrightable about it compared to the original source code which certainly will contain more of the programmer's "personal stamp", but a little bit is enough.
The algorithm for a PRNG is not copyrighted. The implementation in source code is copyrighted (because the EU Directive says so). Those aspects of the implementation that are determined by the algorithm will not be copyrighted.
So we seem to agree that source code (and thus presumably also the object code) of a program that produces uncopyrightable output that is virtually indistinguishable from unimaginatively obtained other uncopyrightable output is indeed protected by copyright.
Yes.
"So that the resulting function always terminates and returns an integer" is a functional restriction.
Yes, so? Virtually every program that is ever written was written to produce output that must satisfy some functional requirements. "Plays Chess at a level >2000 Elo" is a functional requirement. Is a program that does that (and nothing else) not protected by copyright?
I thought you meant that that restriction was the creative contribution. I may have misread.

The chess program's source code is protected. Its functionality is not. An independent functionally identical re-implementation would have an independent copyright. (See C-406/10, "neither the functionality of a computer program nor the programming language and the format of data files used in a computer program in order to exploit certain of its functions constitute a form of expression of that program and, as such, are not protected by copyright in computer programs for the purposes of that directive".)
Just to make sure we are still on the same page:
* If I write code that does virtually the same thing as other code that was generated by a smart automatic process, my code is still protected by copyrights.
Yes.
* If I write code that does virtually the same thing as other code that was generated by a dumb automatic process, my code is still protected by copyrights.
('Smart' and 'dumb' refer here to the designer of that process having been creative or not.)
Yes.
syzygy
Posts: 5557
Joined: Tue Feb 28, 2012 11:56 pm

Re: Copyright and Machine Learning IP

Post by syzygy »

chrisw wrote: Mon Aug 13, 2018 10:53 pm There's a reference in the OP which suggests some jurisdictions already give copyright credit for works from ingenious machines ..

http://www.wipo.int/wipo_magazine/en/20 ... _0003.html

"In Europe the Court of Justice of the European Union (CJEU) has also declared on various occasions, particularly in its landmark Infopaq decision (C-5/08 Infopaq International A/S v Danske Dagblades Forening), that copyright only applies to original works, and that originality must reflect the “author’s own intellectual creation.” This is usually understood as meaning that an original work must reflect the author’s personality, which clearly means that a human author is necessary for a copyright work to exist.

The second option, that of giving authorship to the programmer, is evident in a few countries such as Hong Kong (SAR), India, Ireland, New Zealand and the UK. This approach is best encapsulated in UK copyright law, section 9(3) of the Copyright, Designs and Patents Act (CDPA), which states:

“In the case of a literary, dramatic, musical or artistic work which is computer-generated, the author shall be taken to be the person by whom the arrangements necessary for the creation of the work are undertaken.”

Furthermore, section 178 of the CDPA defines a computer-generated work as one that “is generated by computer in circumstances such that there is no human author of the work”. The idea behind such a provision is to create an exception to all human authorship requirements by recognizing the work that goes into creating a program capable of generating works, even if the creative spark is undertaken by the machine."
It seems to me that in the UK the harmonised EU copyright law has to prevail over UK law. Until Brexit happens, that is...

But certainly almost anything I am saying here in this thread may be wrong outside the EU (although I think most of it applies to the US).

(Not that I cannot be wrong about EU copyright law. But I think I am able to back up quite a bit of it.)

Thanks for the link btw, an interesting article.