AI Sauna/Resources
Latest revision as of 13:17, 6 May 2024
Add a new resource:
- Click on Add a new resource.
- Fill in the basic information and click on Publish changes... to save the page.
- Return to this page and edit the entry to add more information.
Edit an existing resource:
- Click on Edit resources above.
- Select the resource you wish to edit and click Edit.
- In the popup window, make your changes. You can add more input options from the left panel by ticking the blue checkboxes. Confirm by clicking Apply changes.
- Finally, click on Publish changes... to save the page.
Finna API
API The API provides a way to search the material provided by the organizations (Finnish libraries, archives and museums) participating in Finna.fi.
Resource links - Swagger
- Python API client library
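A minimal sketch of a Finna search request, using only the Python standard library. The endpoint https://api.finna.fi/v1/search and the parameter names (lookfor, type, limit, field[]) are assumptions based on the public API and should be checked against the Swagger documentation; the function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; verify against the Swagger documentation linked above.
FINNA_SEARCH_URL = "https://api.finna.fi/v1/search"


def build_finna_search_url(lookfor, limit=5, fields=("id", "title")):
    """Build a search URL for a free-text query (parameter names are assumptions)."""
    params = [("lookfor", lookfor), ("type", "AllFields"), ("limit", str(limit))]
    # The field[] parameter is repeated once per requested record field.
    params += [("field[]", name) for name in fields]
    return f"{FINNA_SEARCH_URL}?{urlencode(params)}"


def search_finna(lookfor, limit=5):
    """Send the request and return the parsed JSON response (needs network access)."""
    with urlopen(build_finna_search_url(lookfor, limit=limit), timeout=30) as response:
        return json.load(response)


print(build_finna_search_url("Helsinki sauna", limit=3))
```

Building the URL separately from sending it makes it easy to inspect the query before calling search_finna, which performs the actual request.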
National Library of Finland on Hugging Face
Collaboration platform A collection of datasets and AI models (Annif and fine-tuned LLMs) published by the National Library of Finland.
Resource links Organization
A Generated Family of Man
Publication An exploratory publication made by the Flickr Foundation in 2023 to investigate and reveal the state of the art of machine-generated captions and imagery.
Resource links A Generated Family of Man
Avoin data – tarjolla Ylen sisältöjä ja metatietoa (Open data: Yle content and metadata on offer)
Frequently asked questions
Resource links Avoin data – tarjolla Ylen sisältöjä ja metatietoa
Word vectors based on Yle's article corpus
Data
Resource links Word vectors based on Yle's article corpus
LUMI Supercomputer
Computing Access to Jupyter Notebook on LUMI supercomputer.
Resource links LUMI access
Future Audiences / List of experiment ideas
Idea list This page outlines experiments for using new technology, like generative AI tools, in the Wikimedia movement to innovate knowledge sharing. These experiments are small-scale and can be quickly executed in hackathons or by volunteer developers. Generated by the Future Audiences team, they aim to inspire Wikimedia community members and others to contribute, discuss, and try out these ideas.
Resource links List of experiment ideas
National Archives of Finland on Hugging Face
Collaboration platform
Resource links Hugging Face Organization
FinBERT-NER - A Finnish named entity recognition model trained to recognize named entities from OCR'd archival data.
Senate Department of Justice records from Finland - A dataset containing the HTR'd text content of a collection of documents produced by the Finnish Senate's Department of Justice between 1900 and 1918.
Early 20th century court records from Finland - A dataset containing the HTR'd text content of a sample of early 20th century court records from Finland.
Harmonized Finnish National Bibliography - Fennica
Data Fennica encompasses metadata for over a million documents, including books, newspapers, maps, etc., with records spanning from 1488 to the present. This dataset includes bibliographic details such as author information, titles, publication years, publication locations, publishers, content types, bibliographic levels, and call numbers (signum) indicating the books’ locations within a library.
Resource links fennica [1]
Photographs from Helsinki City Museum
Data A collection of ca 6000 old photographs (until 1917) from the collections of the Helsinki City Museum along with metadata such as captions, keywords, location and photographer. Intended for e.g. generating descriptions or colorizing.
Resource links dataset on Hugging Face Hub
Photographs from Journalistic Picture Archive JOKA
Data A collection of ca 5000 old photographs (until 1940) from the collections of the Journalistic Picture Archive JOKA along with metadata such as captions, keywords, location and photographer. Intended for e.g. generating descriptions or colorizing.
Resource links dataset on Hugging Face Hub
Finna metadata
Data A dataset consisting of ca 30M metadata records from the Finna service.
Resource links dataset on Hugging Face Hub
Qdrant vector database
API Qdrant is a database for storing vectors along with other data items. It can be used for similarity search, multi-modal search, recommendation engines, retrieval-augmented generation (RAG), etc. See a list of example applications with end-to-end code.
Resource links TBA
Help and instructions The URL and keys for the database API will be provided in person; contact Juho via email or Telegram.
OpenAI GPT3.5-turbo
API Access to GPT3.5-turbo version 1106 for text generation etc. See Azure documentation on GPT models.
Resource links TBA
Help and instructions The URL and keys for the service API will be provided in person; contact Juho via email or Telegram.
Finto AI API
API Finto AI is a service based on Annif for automated subject indexing. It suggests subject headings for texts from a vocabulary to support information retrieval.
Resource links - Swagger-UI API documentation
- Python API client library
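A minimal sketch of a subject suggestion request to Finto AI, using only the Python standard library. The base URL https://ai.finto.fi/v1, the project ID yso-en, the form fields text and limit, and the response keys results, label and score are assumptions based on the Annif REST API; check them against the Swagger-UI documentation. The function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed base URL; verify against the Swagger-UI documentation linked above.
FINTO_AI_BASE = "https://ai.finto.fi/v1"


def build_suggest_request(text, project_id="yso-en", limit=5):
    """Prepare (but do not send) a POST request asking for subject suggestions."""
    body = urlencode({"text": text, "limit": limit}).encode("utf-8")
    return Request(
        f"{FINTO_AI_BASE}/projects/{project_id}/suggest",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def top_labels(response_json):
    """Extract (label, score) pairs from a response shaped like Annif's (assumed)."""
    return [(hit["label"], hit["score"]) for hit in response_json.get("results", [])]


request = build_suggest_request("Old photographs of wooden saunas in Helsinki")
print(request.get_method(), request.full_url)
```

To send the request, pass it to urllib.request.urlopen and parse the body with json.load before handing it to top_labels.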
OpenAI text embedding model Ada-002
API Access to text embedding model Ada-002. Text embeddings are representations of texts as numerical vectors that encode the meaning of the text. Texts that are close to each other in the vector space are therefore expected to be similar in meaning. See Azure tutorial on embeddings.
Resource links TBA
Help and instructions The URL and keys for the service API will be provided in person; contact Juho via email or Telegram.
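The idea that texts close in the vector space are similar in meaning can be sketched with cosine similarity, a common closeness measure for embeddings. The three-dimensional vectors below are invented toy values for illustration, not output of Ada-002, and the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_vector, corpus):
    """Return the corpus keys ordered from most to least similar to the query."""
    return sorted(
        corpus,
        key=lambda name: cosine_similarity(query_vector, corpus[name]),
        reverse=True,
    )


# Toy vectors standing in for real embeddings (assumption for illustration only).
toy_embeddings = {
    "sauna stove": [0.9, 0.1, 0.0],
    "steam room": [0.8, 0.3, 0.1],
    "tax form": [0.0, 0.2, 0.9],
}
query = [1.0, 0.2, 0.0]

print(rank_by_similarity(query, toy_embeddings))
```

With real embeddings the same ranking step is what a vector database such as Qdrant performs at scale.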
Translocalis clippings
Data Translocalis is a digital database of readers' letters written in different locations and published in Finnish newspapers up to the year 1885. The database contains 72 000 letters from Finland and abroad.
Resource links - Dataset description
- Download link
Linked Data Finland
Data A collection of Linked Data from many Finnish cultural heritage organizations and Sampo systems.
Resource links LDF.fi