There is an architectural issue with the author data properties (schema:name, schema:familyName and schema:givenName) in the property list dt:learningResourceTexts (see dalia.ttl#L75-85): authors are nested inside learning resources rather than attached to them directly. Because we use Jena's "one triple equals one document" concept for text indexing, a text search hit on one of those author properties returns the author node, not the learning resource it belongs to, so the resource has to be resolved in an extra step. On the other hand, the "one triple equals one document" concept gives more or less consistent scoring across all indexed fields.
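To make the nesting concrete, here is a minimal sketch of the data shape I am assuming: a learning resource whose schema:author is an RDF list of blank-node authors. The ex: namespace, the resource ex:lr1 and the author name are hypothetical and only serve as illustration; the real data may carry more properties.

```sparql
PREFIX ec:     <https://github.com/tibonto/educor#>
PREFIX schema: <https://schema.org/>
PREFIX ex:     <http://example.org/>   # hypothetical namespace, illustration only

INSERT DATA {
  # The author is a blank node inside an RDF list, not a direct
  # property of the learning resource.
  ex:lr1 a ec:EducationalResource ;
         schema:author (
           [ a schema:Person ;
             schema:givenName  "Jane" ;     # made-up sample values
             schema:familyName "Doe" ;
             schema:name       "Jane Doe" ]
         ) .
}
```

With this shape, a text hit on "Doe" has the blank node as its subject, and ex:lr1 is only reachable by walking back through the list.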
Abdel and I have been thinking about an efficient query for "return all learning resources that match the text 'NFDI'":
    PREFIX dt: <http://dalia.education/text#>
    PREFIX ec: <https://github.com/tibonto/educor#>
    PREFIX list: <http://jena.apache.org/ARQ/list#>
    PREFIX mo: <https://purl.org/ontology/modalia#>
    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX schema: <https://schema.org/>
    PREFIX text: <http://jena.apache.org/text#>

    SELECT DISTINCT ?lr WHERE {
      (?s ?score ?literal ?g ?prop) text:query (dt:learningResourceTexts 'NFDI') .

      OPTIONAL {
        FILTER isBlank(?s) .
        ?s a ?type .
        FILTER (?type = schema:Person || ?type = schema:Organization) .
        ?list list:member ?s .  # equivalent to ?list rdf:rest*/rdf:first ?s but much faster
        ?lr schema:author ?list .
      }

      OPTIONAL {
        FILTER (!isBlank(?s)) .
        ?s a ec:EducationalResource .
        BIND(?s AS ?lr) .
      }

      # Communities also have dcterms:title and dcterms:description fields,
      # so these text search matches have to be excluded.
      FILTER NOT EXISTS { ?lr a mo:Community . }
    }
    ORDER BY DESC(?score)
With the current dataset this executes in about 35 milliseconds.
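For comparison, the property-path form mentioned in the query comment (rdf:rest*/rdf:first instead of list:member) can be sketched as a standalone query. This is only an illustration of that alternative, without the text search part; it lists every author node together with its learning resource, and I have not timed it separately.

```sparql
PREFIX rdf:    <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX schema: <https://schema.org/>

SELECT ?lr ?s WHERE {
  ?s a ?type .
  FILTER (?type = schema:Person || ?type = schema:Organization)
  # Standard SPARQL 1.1 list traversal: walk the rdf:rest chain from
  # the list head and pick each rdf:first element.
  ?list rdf:rest*/rdf:first ?s .
  ?lr schema:author ?list .
}
```

The path variant is portable across SPARQL 1.1 engines, whereas list:member is a Jena ARQ property function, so the speed-up comes at the cost of tying the query to Jena.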
Now the caveat: for "return all learning resources that match the text '*'", the text search returns all indexed triples, and the query takes about 380 milliseconds. The extra time is due to the iterations over the schema:author lists.